DNA SEQUENCING BY STEPWISE
LIGATION AND CLEAVAGE

Field of the Invention

The invention relates generally to methods for determining the nucleotide 
sequence of a polynucleotide, and more particularly, to a method of step-wise removal 
and identification of terminal nucleotides of a polynucleotide. 

BACKGROUND

Analysis of polynucleotides with currently available techniques provides a 
spectrum of information ranging from the confirmation that a test polynucleotide is the 
same or different than a standard sequence or an isolated fragment to the express 
identification and ordering of each nucleoside of the test polynucleotide. Not only are 
such techniques crucial for understanding the function and control of genes and for 
applying many of the basic techniques of molecular biology, but they have also become 
increasingly important as tools in genomic analysis and a great many non-research 
applications, such as genetic identification, forensic analysis, genetic counseling, medical 
diagnostics, and the like. In these latter applications both techniques providing partial 
sequence information, such as fingerprinting and sequence comparisons, and techniques 
providing full sequence determination have been employed, e.g. Gibbs et al, Proc. Natl.
Acad. Sci., 86: 1919-1923 (1989); Gyllensten et al, Proc. Natl. Acad. Sci., 85: 7652-
7656 (1988); Carrano et al, Genomics, 4: 129-136 (1989); Caetano-Anolles et al, Mol.
Gen. Genet., 235: 157-165 (1992); Brenner and Livak, Proc. Natl. Acad. Sci., 86:
8902-8906 (1989); Green et al, PCR Methods and Applications, 1: 77-90 (1991); and
Versalovic et al, Nucleic Acids Research, 19: 6823-6831 (1991).

Native DNA consists of two linear polymers, or strands of nucleotides. Each 
strand is a chain of nucleosides linked by phosphodiester bonds. The two strands are 
held together in an antiparallel orientation by hydrogen bonds between complementary 
bases of the nucleotides of the two strands: deoxyadenosine (A) pairs with thymidine 
(T) and deoxyguanosine (G) pairs with deoxycytidine (C). 

Presently there are two basic approaches to DNA sequence determination: the 
dideoxy chain termination method, e.g. Sanger et al, Proc. Natl. Acad. Sci., 74: 5463-
5467 (1977); and the chemical degradation method, e.g. Maxam et al, Proc. Natl. Acad.
Sci., 74: 560-564 (1977). The chain termination method has been improved in several
ways, and serves as the basis for all currently available automated DNA sequencing
machines, e.g. Sanger et al, J. Mol. Biol., 143: 161-178 (1980); Schreier et al, J. Mol.
Biol., 129: 169-172 (1979); Smith et al, Nucleic Acids Research, 13: 2399-2412 (1985);
Smith et al, Nature, 321: 674-679 (1987); Prober et al, Science, 238: 336-341 (1987);
Section II, Meth. Enzymol., 155: 51-334 (1987); Church et al, Science, 240: 185-188
(1988); Hunkapiller et al, Science, 254: 59-67 (1991); Bevan et al, PCR Methods and
Applications, 1: 222-228 (1992).

Both the chain termination and chemical degradation methods require the
generation of one or more sets of labeled DNA fragments, each having a common origin
and each terminating with a known base. The set or sets of fragments must then be
separated by size to obtain sequence information. In both methods, the DNA fragments
are separated by high resolution gel electrophoresis, which must have the capacity of
distinguishing very large fragments differing in size by no more than a single nucleotide.
Unfortunately, this step severely limits the size of the DNA chain that can be sequenced
at one time. Sequencing using these techniques can reliably accommodate a DNA
chain of up to about 400-450 nucleotides, Bankier et al, Meth. Enzymol., 155: 51-93
(1987); and Hawkins et al, Electrophoresis, 13: 552-559 (1992).

Several significant technical problems have seriously impeded the application of 
such techniques to the sequencing of long target polynucleotides, e.g. in excess of 500- 
600 nucleotides, or to the sequencing of high volumes of many target polynucleotides. 
Such problems include i) the gel electrophoretic separation step which is labor intensive, 
is difficult to automate, and introduces an extra degree of variability in the analysis of 
data, e.g. band broadening due to temperature effects, compressions due to secondary 
structure in the DNA sequencing fragments, inhomogeneities in the separation gel, and 
the like; ii) nucleic acid polymerases whose properties, such as processivity, fidelity, rate 
of polymerization, rate of incorporation of chain terminators, and the like, are often 
sequence dependent; iii) detection and analysis of DNA sequencing fragments which are 
typically present in fmol quantities in spatially overlapping bands in a gel; iv) lower 
signals because the labeling moiety is distributed over the many hundred spatially 
separated bands rather than being concentrated in a single homogeneous phase, and v) 
in the case of single-lane fluorescence detection, the availability of dyes with suitable 
emission and absorption properties, quantum yield, and spectral resolvability, e.g. 
Trainor, Anal. Biochem., 62: 418-426 (1990); Connell et al, Biotechniques, 5: 342-348
(1987); Karger et al, Nucleic Acids Research, 19: 4955-4962 (1991); Fung et al, U.S.
patent 4,855,225; and Nishikawa et al, Electrophoresis, 12: 623-631 (1991).

Another problem exists with current technology in the area of diagnostic
sequencing. An ever widening array of disorders, susceptibilities to disorders, 
prognoses of disease conditions, and the like, have been correlated with the presence of 
particular DNA sequences, or the degree of variation (or mutation) in DNA sequences, 
at one or more genetic loci. Examples of such phenomena include human leukocyte 
antigen (HLA) typing, cystic fibrosis, tumor progression and heterogeneity, p53 proto-
oncogene mutations, ras proto-oncogene mutations, and the like, e.g. Gyllensten et al,
PCR Methods and Applications, 1: 91-98 (1991); Santamaria et al, International 
application PCT/US92/01675; Tsui et al, International application PCT/CA90/00267; 
and the like. A difficulty in determining DNA sequences associated with such 
conditions to obtain diagnostic or prognostic information is the frequent presence of 
multiple subpopulations of DNA, e.g. allelic variants, multiple mutant forms, and the 
like. Distinguishing the presence and identity of multiple sequences with current
sequencing technology is virtually impossible without additional work to isolate and
perhaps clone the separate species of DNA.

A major advance in sequencing technology could be made if an alternative
approach were available for sequencing DNA that did not require high resolution
separations, provided signals more amenable to analysis, and provided a means for
readily analyzing DNA from heterozygous genetic loci.

Summary of the Invention

The invention provides a method of nucleic acid sequence analysis based on 
ligation and cleavage of probes at the terminus of a target polynucleotide. Preferably, 
repeated cycles of such ligation and cleavage are implemented in the method, and in
each such cycle a nucleotide is identified at the end of the target polynucleotide and the 
target polynucleotide is shortened, such that further cycles of ligation, cleavage, and 
identification can take place. That is, preferably, in each cycle the target sequence is 
shortened by a single nucleotide and the cycles are repeated until the nucleotide 
sequence of the target polynucleotide is determined. 

An important feature of the invention is the probe employed in the ligation and
cleavage events. A probe of the invention is a double stranded polynucleotide which (i)
contains a recognition site for a nuclease, and (ii) preferably has a protruding strand
capable of forming a duplex with a complementary protruding strand of the target
polynucleotide. At each cycle in the latter embodiment, only those probes whose
protruding strands form perfectly matched duplexes with the protruding strand of the
target polynucleotide are ligated to the end of the target polynucleotide to form a
ligated complex. After removal of the unligated probe, a nuclease recognizing the probe
cuts the ligated complex at a site one or more nucleotides from the ligation site along
the target polynucleotide leaving an end, usually a protruding strand, capable of
participating in the next cycle of ligation and cleavage. An important feature of the
nuclease is that its recognition site be separate from its cleavage site. As is described
more fully below, in the course of such cycles of ligation and cleavage, the terminal
nucleotides of the target polynucleotide are identified.
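
By way of illustration only, the separation between recognition site and cleavage
site can be sketched in Python. The function name and the example sequence are
hypothetical. The site GGATG with cuts 9 and 13 nucleotides downstream are the values
commonly cited for Fok I and are used here merely as assumed example parameters.

```python
def cleavage_positions(top_strand, recognition_site, top_offset, bottom_offset):
    """Locate the cuts made by a nuclease whose cleavage site is displaced
    from its recognition site.

    Positions are counted in nucleotides from the start of `top_strand`
    (written 5'->3'). Returns (top_cut, bottom_cut), or None when the
    recognition site is absent.
    """
    start = top_strand.find(recognition_site)
    if start < 0:
        return None
    site_end = start + len(recognition_site)
    return site_end + top_offset, site_end + bottom_offset


# Hypothetical ligated complex: probe (recognition site plus spacer) joined to a target.
complex_top = "AAGGATG" + "ACGTACGTACGTACGTAC"
top_cut, bottom_cut = cleavage_positions(complex_top, "GGATG", 9, 13)
overhang_length = bottom_cut - top_cut  # the staggered cut leaves a protruding strand
```

Because both cuts fall a fixed distance beyond the recognition site, the length of the
spacer in the probe determines how far into the target the cut is made.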



In one aspect of the invention, more than one nucleotide at the terminus of a
target polynucleotide can be identified and/or cleaved during each cycle of the method. 

Generally, the method of the invention comprises the following steps: (a)
ligating a probe to an end of the polynucleotide, the probe having a nuclease recognition
site; (b) identifying one or more nucleotides at the end of the polynucleotide; (c)
cleaving the polynucleotide with a nuclease recognizing the nuclease recognition site of
the probe such that the polynucleotide is shortened by one or more nucleotides; and (d)
repeating steps (a) through (c) until the nucleotide sequence of the polynucleotide is
determined. As is described more fully below, the order of steps (a) through (c) may
vary with different embodiments of the invention. For example, identifying the one or
more nucleotides can be carried out either before or after cleavage of the ligated
complex from the target polynucleotide. Likewise, ligating a probe to the end of the
polynucleotide may follow the step of identifying in some preferred embodiments of the
invention. Preferably, the method further includes a step of removing the unligated
probe after the step of ligating.
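
Steps (a) through (d) can be sketched as a toy simulation in Python. This is a
simplified model under stated assumptions, not the claimed method itself: one nucleotide
is read and removed per cycle, each probe is reduced to a single protruding base, and the
function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_ligation_and_cleavage(target):
    """Toy model of steps (a) through (d) with one nucleotide removed per cycle.

    `target` is the strand being read, written so that index 0 is the
    terminal nucleotide exposed at the start of the first cycle.
    """
    remaining = target
    read = []
    while remaining:  # (d) repeat until nothing is left
        terminal = remaining[0]
        # (a) only the probe whose protruding base pairs with the terminal base ligates
        ligated = [probe for probe in "ACGT" if COMPLEMENT[probe] == terminal]
        if len(ligated) != 1:
            raise ValueError(f"no probe pairs with {terminal!r}")
        # (b) the label on the ligated probe reveals the terminal nucleotide
        read.append(COMPLEMENT[ligated[0]])
        # (c) cleavage shortens the target by one nucleotide
        remaining = remaining[1:]
    return "".join(read)
```

Calling `sequence_by_ligation_and_cleavage("ATGCCTG")` walks through seven cycles and
returns the input sequence, one identified nucleotide per cycle.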

Preferably, whenever natural protein endonucleases are employed as the 
nuclease, the method further includes a step of methylating the target polynucleotide at 
the start of a sequencing operation to prevent spurious cleavages at internal recognition 
sites fortuitously located in the target polynucleotide. 
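
The internal recognition sites that motivate the methylation step can be located
with a simple scan of both strands. The following Python sketch is illustrative only.
The function names are hypothetical, and GGATG (the recognition sequence commonly cited
for Fok I) is an assumed example site.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_all(text, pattern):
    """Return every start index of `pattern` in `text`, overlaps included."""
    hits = []
    i = text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def internal_sites(target, site):
    """Start indices where `site` occurs on either strand of `target`."""
    forward = find_all(target, site)
    reverse = find_all(target, reverse_complement(site))
    return sorted(set(forward + reverse))
```

A non-empty result indicates positions at which an unprotected target could be cut
spuriously.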

The present invention overcomes many of the deficiencies inherent to current
methods of DNA sequencing: there is no requirement for the electrophoretic separation
of closely-sized DNA fragments; no difficult-to-automate gel-based separations are
required; no polymerases are required for generating nested sets of DNA sequencing
fragments; detection and analysis are greatly simplified because signal-to-noise ratios are
much more favorable on a nucleotide-by-nucleotide basis, permitting smaller sample
sizes to be employed; and for fluorescent-based detection schemes, analysis is further
simplified because fluorophores labeling different nucleotides may be separately
detected in homogeneous solutions rather than in spatially overlapping bands.

The present invention is readily automated, both for small-scale serial operation
and for large-scale parallel operation, wherein many target polynucleotides or many
segments of a single target polynucleotide are sequenced simultaneously. Unlike
present sequencing approaches, the progressive nature of the method (that is,
determination of a sequence nucleotide-by-nucleotide) permits one to monitor the
progress of the sequencing operation in real time which, in turn, permits the operation
to be curtailed, or re-started, if difficulties arise, thereby leading to significant savings in
time and reagent usage. Also unlike current approaches, the method permits the
simultaneous determination of allelic forms of a target polynucleotide. As described
more fully below, if a population of target polynucleotides consists of several
subpopulations of distinct sequences, e.g. polynucleotides from a heterozygous genetic
locus, then the method can identify the proportion of each nucleotide at each position in
the sequence.
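
The per-position proportions mentioned above can be illustrated with a short Python
sketch. It assumes an idealized population in which every molecule contributes equally
to the signal at each cycle. The function name and the example sequences are
hypothetical.

```python
from collections import Counter


def base_proportions(population):
    """Proportion of each nucleotide at each position across a population.

    `population` is a non-empty list of sequences read from the same end.
    Positions beyond the shortest sequence are ignored.
    """
    length = min(len(seq) for seq in population)
    proportions = []
    for i in range(length):
        counts = Counter(seq[i] for seq in population)
        total = sum(counts.values())
        proportions.append({base: counts[base] / total for base in sorted(counts)})
    return proportions


# Two alleles differing at the third position, present in equal amounts.
mixed = base_proportions(["ACGT", "ACTT"])
```

In this example the third position reports G and T at 0.5 each, while the other
positions report a single nucleotide.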

Generally, the method of the invention is applicable to all tasks where DNA
sequencing is employed, including medical diagnostics, genetic mapping, genetic
identification, forensic analysis, molecular biology research, and the like.

Brief Description of the Drawings
Figure 1a illustrates a preferred structure of a labeled probe of the invention.

Figure 1b illustrates a probe and terminus of a target polynucleotide wherein a
separate labeling step is employed to identify one or more nucleotides in the protruding
strand of a target polynucleotide.

Figure 1c illustrates steps of an embodiment wherein a nucleotide of the target
polynucleotide is identified by extension with a polymerase in the presence of labeled
dideoxynucleoside triphosphates followed by their excision, strand extension, and strand
displacement.

Figure 1d diagrammatically illustrates an embodiment in which nucleotide
identification is carried out by polymerase extension of a probe strand in the presence of
labeled chain-terminating nucleoside triphosphates.

Figure 1e diagrammatically illustrates an embodiment in which nucleotide
identification is carried out by polymerase extension in the presence of unlabeled chain-
terminating 3'-amino nucleoside triphosphates followed by ligation of a labeled probe.

Figure 1f illustrates probe assembly at the end of a target polynucleotide having
a 5' protruding strand.

Figure 1g illustrates probe assembly at the end of a target polynucleotide having
a 3' protruding strand.

Figure 2 illustrates the relative positions of the nuclease recognition site, ligation
site, and cleavage site in a ligated complex.

Figures 3a through 3h diagrammatically illustrate the embodiment referred to
herein as "double stepping," or the simultaneous use of two different nucleases in
accordance with the invention.

Figures 4a through 4d illustrate data showing the fidelity of nucleotide
identification through ligation with a ligase.

Figures 5a through 5c illustrate data showing nucleotide identification through
polymerase extension.

Definitions

As used herein "sequence determination" or "determining a nucleotide sequence" 
in reference to polynucleotides includes determination of partial as well as full sequence 
information of the polynucleotide. That is, the term includes sequence comparisons, 
fingerprinting, and like levels of information about a target polynucleotide, as well as the
express identification and ordering of nucleosides, usually each nucleoside, in a target
polynucleotide.

"Perfectly matched duplex" in reference to the protruding strands of probes and
target polynucleotides means that the protruding strand from one forms a double
stranded structure with the other such that each nucleotide in the double stranded
structure undergoes Watson-Crick base pairing with a nucleotide on the opposite
strand. The term also comprehends the pairing of nucleoside analogs, such as
deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed
to reduce the degeneracy of the probes.
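
The definition of a perfectly matched duplex can be expressed as a simple check. The
Python sketch below is a minimal illustration that considers only the four natural
nucleotides. Pairing of analogs such as deoxyinosine is deliberately not modeled, and
the function name is hypothetical.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def is_perfectly_matched(strand_a, strand_b):
    """True if two protruding strands, each written 5'->3', pair at every position.

    The strands are antiparallel, so the first base of `strand_a` is
    compared with the last base of `strand_b`, and so on.
    """
    if len(strand_a) != len(strand_b):
        return False
    return all(pair in WATSON_CRICK for pair in zip(strand_a, reversed(strand_b)))
```

For example, `is_perfectly_matched("ACGG", "CCGT")` holds, whereas a single
substitution in either strand, as in `is_perfectly_matched("ACGG", "CCGA")`, does not.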

The term "oligonucleotide" as used herein includes linear oligomers of
nucleosides or analogs thereof, including deoxyribonucleosides, ribonucleosides, and the
like. Usually oligonucleotides range in size from a few monomeric units, e.g. 3-4, to
several hundreds of monomeric units. Whenever an oligonucleotide is represented by a
sequence of letters, such as "ATGCCTG," it will be understood that the nucleotides are
in 5'->3' order from left to right and that "A" denotes deoxyadenosine, "C" denotes
deoxycytidine, "G" denotes deoxyguanosine, and "T" denotes thymidine, unless
otherwise noted.

As used herein, "nucleoside" includes the natural nucleosides, including 2'-deoxy
and 2'-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd
Ed. (Freeman, San Francisco, 1992). "Analogs" in reference to nucleosides includes
synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g.
described generally by Scheit, Nucleotide Analogs (John Wiley, New York, 1980).
Such analogs include synthetic nucleosides designed to enhance binding properties,
reduce degeneracy, increase specificity, and the like.

Detailed Description of the Invention
The invention provides a method of sequencing nucleic acids which obviates 
electrophoretic separation of similarly sized DNA fragments and which eliminates the 
difficulties associated with the detection and analysis of spatially overlapping bands of 
DNA fragments in a gel or like medium. Moreover, the invention obviates the need to
generate DNA fragments from long single stranded templates with a DNA polymerase. 

As mentioned above, the probes ligated to the target polynucleotide are an
important feature of the invention. Generally, the probes of the invention provide a
"platform" from which a nuclease cleaves the target polynucleotide to which the probe is
ligated. Probes of the invention can also provide a means for identifying or labeling a
nucleotide at the end of the target polynucleotide. Probes do not necessarily provide
both functions in every embodiment.

In one aspect of the invention, probes have the form illustrated in Figure 1a. In
this embodiment, probes are double stranded segments of DNA having a protruding
strand at one end 10, at least one nuclease recognition site 12, and a spacer region 14
between the recognition site and the protruding end 10. Preferably, probes also include
a label 16, which in this particular embodiment is illustrated at the end opposite the
protruding strand. The probes may be labeled by a variety of means and at a variety of
locations, the only restriction being that the labeling means selected does not interfere
with the ligation step or with the recognition of the probe by the nuclease.

In the above embodiment, whenever a nuclease leaves a 5' phosphate on the
terminus of the target polynucleotide, it is sometimes desirable to remove it, e.g. by
treatment with a standard phosphatase, prior to ligation. This prevents undesired
ligation of one of the strands when the protruding strands of the probe and target
sequence fail to form a perfectly matched duplex. Such undesired ligation is particularly
problematic when a mismatch occurs precisely at the nucleotide position where
identification is sought. Where such phosphatase treatment is employed, the "nick"
remaining in the ligated complex after the initial ligation can be repaired by kinase
treatment followed by a second ligation step.

Preferably, embodiments of the invention employing the above type of probe
comprise the following steps: (a) ligating a probe to an end of the polynucleotide
having a protruding strand to form a ligated complex, the probe having a
complementary protruding strand to that of the polynucleotide and the probe having a
nuclease recognition site; (b) identifying one or more nucleotides in the protruding
strand of the polynucleotide, e.g. by the identity of the ligated probe; (c) cleaving the
ligated complex with a nuclease; and (d) repeating steps (a) through (c) until the
nucleotide sequence of the polynucleotide is determined. The step of identifying can
take place either before or after the step of cleaving. Preferably, the one or more
nucleotides in the protruding strand of the polynucleotide are identified prior to
cleavage. In further preference, the method also includes a step of removing unligated
probe from the ligated complex.

It is not critical whether protruding strand 10 of the probe is a 5' or 3' end.
However, in this embodiment, it is important that the protruding strands of the target
polynucleotide and probes be capable of forming perfectly matched duplexes to allow
for specific ligation. If the protruding strands of the target polynucleotide and probe are
of different lengths, the resulting gap can be filled in by a polymerase prior to ligation, e.g.



WO 95/27080
PCT/US95/03678

as in "gap LCR" disclosed in Backman et al, European patent application 91100959.5.
Such gap filling can be used as a means for identifying one or more nucleotides in the
protruding strand of the target polynucleotide. Preferably, the number of nucleotides in
the respective protruding strands is the same so that both strands of the probe and
target polynucleotide are capable of being ligated without a filling step. Preferably, the
protruding strand of the probe is from 2 to 6 nucleotides long. As indicated below, the
greater the length of the protruding strand, the greater the complexity of the probe
mixture that is applied to the target polynucleotide during each ligation and cleavage
cycle.
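
The growth in probe mixture complexity with protruding strand length follows from
simple counting. A fully degenerate mixture over the four natural nucleotides contains
4^n distinct protruding strands of length n. The Python sketch below is illustrative
only, with hypothetical function names, and tabulates this for the preferred range of 2
to 6 nucleotides.

```python
from itertools import product


def probe_overhangs(length, alphabet="ACGT"):
    """List every protruding-strand sequence of the given length."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


def probe_mixture_size(length, alphabet_size=4):
    """Number of distinct overhangs in a fully degenerate mixture."""
    return alphabet_size ** length


sizes = {n: probe_mixture_size(n) for n in range(2, 7)}
```

Here `sizes` maps 2, 3, 4, 5 and 6 nucleotides to 16, 64, 256, 1024 and 4096 distinct
protruding strands respectively.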

In another aspect of the invention, the primary function of the probe is to
provide a site for a nuclease to bind to the ligated complex so that the complex can be
cleaved and the target polynucleotide shortened. In this aspect of the invention,
identification of the nucleotides can take place separately from probe ligation and
cleavage. This embodiment provides several advantages: First, sequence determination
does not require that the protruding strand of the ligated probe be perfectly
complementary to the protruding strand of the target polynucleotide, thereby permitting
greater flexibility in the control of hybridization stringency. Second, one need not
provide a fully degenerate set of probes based on the four natural nucleotides. So-called
"wild card" nucleotides, or "degeneracy reducing analogs" can be provided to
significantly reduce, or even eliminate, the complexity of the probe mixture employed in
the ligation step, since specific binding is not critical to nucleotide identification in this
embodiment. Third, if identification is not carried out via a labeling means on the probe,
then probes designed for blunt end ligation may be employed with no need for using
degenerate mixtures.

Preferably, this embodiment of the invention comprises the following steps: (a)
providing a polynucleotide having a protruding strand; (b) identifying one or more
nucleotides in the protruding strand by extending a 3' end of a strand with a nucleic acid
polymerase; (c) ligating a probe to an end of the polynucleotide to form a ligated
complex; (d) cleaving the ligated complex with a nuclease; and (e) repeating steps (a)
through (d) until the nucleotide sequence of the polynucleotide is determined.
Preferably, the target polynucleotide has a 3' recessed strand which is extended by the
nucleic acid polymerase in the presence of chain-terminating nucleoside triphosphates,
and the nuclease used produces a 3' recessed strand and 5' protruding strand at the
terminus of the target polynucleotide.

An example of this embodiment is illustrated in Figure 1b: The 3' recessed
strand of polynucleotide (15) is extended with a nucleic acid polymerase in the presence
of the four dideoxynucleoside triphosphates, each carrying a distinguishable fluorescent
label, so that the 3' recessed strand is extended by one nucleotide (11), which permits its
complementary nucleotide in the 5' protruding strand of polynucleotide (15) to be
identified. Probe (9) having recognition site (12), spacer region (14), and
complementary protruding strand (10), is then ligated to polynucleotide (15) to form
ligated complex (17). Ligated complex (17) is then cleaved at cleavage site (19) to
release a labeled fragment (21) and augmented probe (23). A shortened polynucleotide
(15) with a regenerated 3' recessed strand is then ready for the next cycle of
identification, ligation, and cleavage.

In such embodiments, the first nucleotide of the 5' protruding strand adjacent to
the double stranded portion of the target polynucleotide is readily identified by
extending the 3' strand with a nucleic acid polymerase in the presence of chain-
terminating nucleoside triphosphates. Preferably, the 3' strand is extended by a nucleic
acid polymerase in the presence of the four chain-terminating nucleoside triphosphates,
each being labeled with a distinguishable fluorescent dye so that the added nucleotide is
readily identified by the color of the attached dye. Such chain-terminating nucleoside
triphosphates are available commercially, e.g. labeled dideoxynucleoside triphosphates,
such as described by Hobbs, Jr. et al, U.S. patent 5,047,519; Cruickshank, U.S. patent
5,091,519; and the like. Procedures for such extension reactions are described in
various publications, including Syvanen et al, Genomics, 8: 684-692 (1990); Goelet et
al, International Application No. PCT/US92/01905; Livak and Brenner, U.S. patent
5,102,785; and the like.

A probe may be ligated to the target polynucleotide using conventional 
procedures, as described more fully below. Preferably, the probe is ligated after a single 
nucleotide extension of the 3' strand of the target polynucleotide. More preferably, the 
number of nucleotides in the protruding strand of the probe is the same as the number of 
nucleotides in the protruding strand of the target polynucleotide after the extension step.
That is, if the nuclease provides a protruding strand having four nucleotides, then after 
the extension step the protruding strand will have three nucleotides and the protruding 
strand of the preferred probe will have three nucleotides. 
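
The single-nucleotide extension and its effect on the length of the protruding
strand can be sketched as follows. This Python sketch is a simplified model. It assumes
the 5' protruding strand is written 5'->3', so that its last base is the one adjacent to
the double stranded portion. The function name and example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend_recessed_strand(protruding_5p):
    """Model a single-nucleotide extension of the 3' recessed strand.

    `protruding_5p` is the 5' protruding strand written 5'->3', so its last
    base is the one adjacent to the double stranded portion.
    Returns (incorporated_terminator, identified_template_base, new_overhang).
    """
    if not protruding_5p:
        raise ValueError("no protruding strand to copy")
    template_base = protruding_5p[-1]
    terminator = COMPLEMENT[template_base]  # labeled chain terminator that pairs with it
    return terminator, template_base, protruding_5p[:-1]


# A four-nucleotide protruding strand, as left by the nuclease in the example above.
terminator, identified, overhang = extend_recessed_strand("ACGT")
```

After the extension the protruding strand in this example has three nucleotides, so a
probe with a three-nucleotide protruding strand would be the matching choice.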

The cleavage step in this embodiment may be accomplished by a variety of
techniques, depending on the effect that the added chain-terminating nucleotide has on
the efficiencies of the nuclease and/or ligase employed. Preferably, a ligated complex is
formed with the presence of the labeled chain-terminating nucleotide, which is
subsequently cleaved with the appropriate nuclease, e.g. a class IIs restriction
endonuclease, such as Fok I, or the like.

In a preferred embodiment, after extension and ligation, the chain-terminating
nucleotide may be excised. Preferably, this is carried out by the 3'->5' exonuclease
activity (i.e. proof-reading activity) of a DNA polymerase, e.g. T4 DNA polymerase,
acting in the presence of the appropriate nucleoside triphosphates. By the action of this
enzyme, the chain-terminating nucleoside (11) is exchanged with a natural counterpart
and the strand is extended, displacing the unligated probe strand (25). Conveniently,
when probes having protruding strands are employed, this step simultaneously caps the
target polynucleotides that failed to ligate to a probe in a preceding ligation step by
"filling in" their ends thereby preventing subsequent ligation.

Such excision may also be carried out chemically, provided that the labeled
chain-terminating nucleoside is attached by a labile bond, such as an acid-labile
phosphoramidate bond. Synthesis of such nucleoside phosphoramidates and their use
with DNA polymerases are described in Letsinger et al, J. Am. Chem. Soc., 94: 292-293
(1972) and Letsinger et al, Biochem., 15: 2810-2816 (1976). After identification, the
phosphoramidate bond is cleaved and the nucleoside excised by mild acid to leave a
terminal phosphate group which must be removed with a 3' phosphatase prior to the
next cycle.

In another embodiment, the chain-terminating nucleotide is excised and the
recessed 3' strand extended before ligation leaving a blunt-ended target polynucleotide.
A subsequent cycle is then initiated by ligation of a blunt-ended probe to the end of the
target polynucleotide. The use of a probe with a blunt end eliminates the need to
employ multiple probes, because there are no protruding strands that have to be
hybridized in order for ligation to take place.

In another variation of this embodiment, a nuclease is selected which leaves a
one nucleotide 5' protruding strand after digestion, e.g. Alw I. Thus, chain extension
need not be carried out in the presence of chain-terminating nucleoside triphosphates;
ordinary deoxynucleoside triphosphates can be employed to leave a flush-ended
polynucleotide. A blunt-ended probe is then used to initiate the next cycle. Preferably,
the nucleoside triphosphates used are labeled, as would be the chain-terminating analogs
described in the above embodiments. In further preference, the label is attached by way
of a selectively cleavable bond, so that the label can be removed to enhance the
efficiency of the nuclease in the subsequent cycle. Several such cleavable linkage
moieties are available, e.g. Herman et al, Anal. Biochem., 156: 48-55 (1986) (disulfide
linker); Urdea, U.S. patents 4,775,619 and 5,118,605.

In yet another aspect of this embodiment, after ligation, a 3' end of a strand of
the probe is extended with a DNA polymerase in the presence of labeled chain-
terminating nucleoside triphosphates, as illustrated in Figure 1d. There, target
polynucleotide (15) having a 3' protruding end is ligated to probe (130) having a
complementary 3' protruding end (134) one nucleotide less in length. That is, when the
3' protruding strand (134) of probe (130) has three nucleotides, the 3' protruding strand
of target polynucleotide (15) would have at least four nucleotides. Ligation results in
the formation of ligated complex (17) with gap (132). Gap (132) is then filled by
extending 3' protruding end (134) with a nucleic acid polymerase in the presence of
chain-terminating nucleoside triphosphates. After cleavage, the cycle can be repeated.

This embodiment may also be implemented with unlabeled chain-terminating
nucleoside triphosphates, as illustrated in Figure 1e. Target polynucleotide (15) is
successively exposed to different 3'-aminonucleoside triphosphates in the presence of a
nucleic acid polymerase (150). The 3'-aminonucleoside triphosphates act as chain-
terminators when incorporated. For example, 3'-aminoadenosine triphosphate (152)
shown incorporated in Figure 1e stops further strand extension and reduces the length
of the protruding strand by one nucleotide, from 4 to 3. After such exposure, probe
(154) with label (155) corresponding to the adenosine chain-terminator is mixed with
the target sequence for ligation (156). As the labeled probe has a protruding strand of 3
nucleotides, it will only ligate if there has been an extension. If no ligation takes place,
and no probe remains attached after washing, then the next 3'-aminonucleoside
triphosphate and corresponding probe are tried. This process continues until the target
polynucleotide is successfully extended and a corresponding probe is ligated to form
ligated complex (17). The synthesis of 3'-aminonucleoside triphosphates is described in
Kutateladze et al, FEBS Letters, 153: 420-426 (1983), Krayevsky et al, Biochimica et
Biophysica Acta, 783: 216-220 (1984), and Herrlein et al, Helvetica Chimica Acta, 77:
586-598 (1994). The ligation properties of oligonucleotides having terminal 3'-
aminonucleosides are described in Fung and Gryaznov, International application
PCT/US94/03087. The chain terminating properties of 3'-aminonucleotides are
described in Herrlein et al (cited above).
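
The trial-and-wash logic of this embodiment can be sketched as a loop over the
candidate chain terminators. The Python sketch below is a simplified model under stated
assumptions. Incorporation is treated as occurring only for the correctly pairing
terminator, the order in which terminators are tried is arbitrary, and the function name
is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def identify_by_successive_terminators(template_base, order="ACGT"):
    """Try unlabeled chain terminators one at a time.

    Returns (identified_template_base, terminators_tried). A terminator is
    incorporated only if it pairs with `template_base`; only then is the
    protruding strand short enough for the matching labeled probe to ligate.
    """
    tried = []
    for terminator in order:
        tried.append(terminator)
        extended = COMPLEMENT[terminator] == template_base
        if extended:
            # The probe labeled for this terminator ligates; its label gives the call.
            return COMPLEMENT[terminator], tried
        # Otherwise no probe remains after washing, so the next terminator is tried.
    raise ValueError(f"no terminator pairs with {template_base!r}")
```

In this model at most four exposures are needed per cycle, and the number actually
required depends on where the correct terminator falls in the chosen order.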

In yet another embodiment of the invention, a probe is assembled at the end of a 
target polynucleotide in two steps, as illustrated by the example in Figure 1f. A first
single stranded oligonucleotide (100) having a 5' monophosphate is annealed to and
ligated with target polynucleotide (15) having a 5' monophosphate on its protruding 
strand to form a precursor (104) to ligated complex (17). A second single stranded 
oligonucleotide (102) complementary to the protruding strand of precursor (104) is 
annealed to and ligated with precursor (104) to form ligated complex (17). As with the 
double stranded probes described more fully below, first oligonucleotide (100) may be 
delivered to the target polynucleotide as a mixture and ligation preferably takes place at 
high stringency so that perfectly matched hybrids (between the protruding strand of the 
target polynucleotide and the 5' end of the first oligonucleotide) are preferentially 
ligated. Clearly, second oligonucleotide (102) need only have a sequence 
complementary to the protruding portion of precursor (104) so that a second ligation 
can take place to form ligated complex (17). 

In another form of this embodiment, illustrated in Figure 1g, a first single
stranded oligonucleotide (120) is annealed to and ligated with target polynucleotide (15)
having a 5' monophosphate on its recessed strand to form a precursor (124) to ligated
complex (17). A second single stranded oligonucleotide (122) complementary to the
protruding strand of precursor (124) and having a 5' monophosphate is annealed to and
ligated with precursor (124) to form ligated complex (17). As with the double stranded
probes described more fully below, first oligonucleotide (120) may be delivered to the
target polynucleotide as a mixture and ligation preferably takes place at high stringency
so that perfectly matched hybrids (between the protruding strand of the target
polynucleotide and the 3' end of the first oligonucleotide) are preferentially ligated. As
above, second oligonucleotide (122) need only have a sequence complementary to the
protruding portion of precursor (124) so that a second ligation can take place to form
ligated complex (17).

The complementary strands of the probes are conveniently synthesized on an
automated DNA synthesizer, e.g. an Applied Biosystems, Inc. (Foster City, California)
model 392 or 394 DNA/RNA Synthesizer, using standard chemistries, such as
phosphoramidite chemistry, e.g. disclosed in the following references: Beaucage and
Iyer, Tetrahedron, 48: 2223-2311 (1992); Molko et al, U.S. patent 4,980,460; Koster et
al, U.S. patent 4,725,677; Caruthers et al, U.S. patents 4,415,732; 4,458,066; and
4,973,679; and the like. Alternative chemistries, e.g. resulting in non-natural backbone
groups, such as phosphorothioate, phosphoramidate, and the like, may also be employed
provided that the resulting oligonucleotides are compatible with the ligation and
cleavage reagents. After synthesis, the complementary strands are combined to form a
double stranded probe. Generally, the protruding strand of a probe is synthesized as a
mixture, so that every possible sequence is represented in the protruding portion. For
example, if the protruding portion consisted of four nucleotides, in one embodiment
four mixtures are prepared as follows:

X1X2 ... XiNNNA,
X1X2 ... XiNNNC,
X1X2 ... XiNNNG, and
X1X2 ... XiNNNT

where the "NNNs" represent every possible 3-mer and the "Xs" represent the duplex
forming portion of the strand. Thus, each of the four probes listed above contains 4^3, or
64, distinct sequences; or, in other words, each of the four probes has a degeneracy of
64. For example, X1X2 ... XiNNNA contains the following sequences:

X1X2 ... XiAAAA
X1X2 ... XiAACA
X1X2 ... XiAAGA
X1X2 ... XiAATA
X1X2 ... XiACAA
      .
      .
      .
X1X2 ... XiTGTA
X1X2 ... XiTTAA
X1X2 ... XiTTCA
X1X2 ... XiTTGA
X1X2 ... XiTTTA
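
As an illustration only, the membership of one such mixture can be enumerated with a short Python sketch. The function name and its parameters are hypothetical; only the protruding portion (NNN followed by the fixed terminal base) is generated, and the duplex forming "X" portion is omitted.

```python
from itertools import product

def expand_probe_mixture(fixed_terminal_base, n_degenerate=3):
    """List the protruding-strand sequences of one NNN<base> mixture.

    Hypothetical helper: every combination of A, C, G, T at the degenerate
    positions, followed by the single fixed base.
    """
    return ["".join(nnn) + fixed_terminal_base
            for nnn in product("ACGT", repeat=n_degenerate)]

mixture_a = expand_probe_mixture("A")
print(len(mixture_a))    # 64
print(mixture_a[:5])     # ['AAAA', 'AACA', 'AAGA', 'AATA', 'ACAA']
print(mixture_a[-1])     # TTTA
```

The count of 64 is simply 4^3, one factor of four per degenerate position.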



Such mixtures are readily synthesized using well known techniques, e.g. as disclosed in
Telenius et al, Genomics, 13: 718-725 (1992); Welsh et al, Nucleic Acids Research, 19:
5275-5279 (1991); Grothues et al, Nucleic Acids Research, 21: 1321-1322 (1993);
Hartley, European patent application 90304496.4; and the like. Generally, these
techniques simply call for the application of mixtures of the activated monomers to the
growing oligonucleotide during the coupling steps where one desires to introduce the
degeneracy. As discussed above, in some embodiments it may be desirable to reduce
the degeneracy of the probes. This can be accomplished using degeneracy reducing
analogs, such as deoxyinosine, 2-aminopurine, or the like, e.g. as taught in Kong Thoo
Lin et al, Nucleic Acids Research, 20: 5149-5152, or by U.S. patent 5,002,867;
Nichols et al, Nature, 369: 492-493 (1994); and the like.

Preferably, for oligonucleotides with phosphodiester linkages, the duplex
forming region of a probe is between about 12 and about 30 basepairs in length; more
preferably, its length is between about 15 and about 25 basepairs.

From the above, it is clear that the probes can have a wide variety of forms. For
example, the probes can have the form X1X2 ... XiANNN, X1X2 ... XiNANN, X1X2
... XiNNAN, or the like. Or, the number of probe sets could be increased and the
degeneracy reduced by constructing 16 sets of probes of 16-fold degeneracy having the form:

X1X2 ... XiNNAA, X1X2 ... XiNNAC, X1X2 ... XiNNAG, and so on.
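
The trade-off between the number of probe sets and the degeneracy of each set follows from simple counting, sketched below in Python. The function name and arguments are hypothetical, and a four-letter alphabet with no degeneracy reducing analogs is assumed.

```python
def probe_set_design(overhang_length, fixed_positions):
    """Return (number_of_sets, degeneracy_per_set) for a 4-letter alphabet.

    Each fixed position multiplies the number of sets by four; each
    remaining degenerate position multiplies the degeneracy by four.
    """
    if not 0 <= fixed_positions <= overhang_length:
        raise ValueError("fixed_positions must lie between 0 and overhang_length")
    number_of_sets = 4 ** fixed_positions
    degeneracy = 4 ** (overhang_length - fixed_positions)
    return number_of_sets, degeneracy

for fixed in (1, 2, 3):
    print(fixed, probe_set_design(4, fixed))
# 1 (4, 64)
# 2 (16, 16)
# 3 (64, 4)
```

For a four-nucleotide protruding portion, one fixed position gives the four 64-fold mixtures described earlier, and two fixed positions give the 16 sets of 16-fold degeneracy.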

It is not crucial that the duplex forming region of each such set of probes have
the same length. Size differences among the probes can be used as a means for
identifying them, e.g. Skolnick et al, Genomics, 2: 273-279 (1988). Also, in some
embodiments, it may be desirable to synthesize the probe as a single polynucleotide
which contains self-complementary regions. After synthesis, the self-complementary
regions are allowed to anneal to form a probe with a protruding strand at one end and a
single stranded loop at the other end. Preferably, in such embodiments the loop region
comprises from about 3 to 10 nucleotides, or other comparable linking moieties, e.g.
alkylether groups, such as disclosed in U.S. patent 4,914,210. Many techniques are
available for attaching reactive groups to the bases or internucleoside linkages for
labeling, as discussed below.

When conventional ligases are employed in the invention, as described more
fully below, the 5' end of the probe may be phosphorylated in some embodiments. A 5'
monophosphate can be attached to a second oligonucleotide either chemically or
enzymatically with a kinase, e.g. Sambrook et al, Molecular Cloning: A Laboratory
Manual, 2nd Edition (Cold Spring Harbor Laboratory, New York, 1989). Chemical
phosphorylation is described by Horn and Urdea, Tetrahedron Lett., 27: 4705 (1986),
and reagents for carrying out the disclosed protocols are commercially available, e.g. 5'
Phosphate-ON™ from Clontech Laboratories (Palo Alto, California). Thus, in some
embodiments, probes may have the form:

5'-X1X2 ... XiTTGA
   Y1Y2 ... Yip

or the form:

5'-pAGTTX1X2 ... Xi
        Y1Y2 ... Yi

or the like, where the Y's are the complementary nucleotides of the X's and "p" is a 
monophosphate group. 

The probes of the invention can be labeled in a variety of ways, including the
direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric
moieties, and the like. Many comprehensive reviews of methodologies for labeling
DNA and constructing DNA probes provide guidance applicable to constructing probes
of the present invention. Such reviews include Matthews et al, Anal. Biochem., Vol.
169, pgs. 1-25 (1988); Haugland, Handbook of Fluorescent Probes and Research
Chemicals (Molecular Probes, Inc., Eugene, 1992); Keller and Manak, DNA Probes,
2nd Edition (Stockton Press, New York, 1993); Eckstein, editor, Oligonucleotides
and Analogues: A Practical Approach (IRL Press, Oxford, 1991); Wetmur, Critical
Reviews in Biochemistry and Molecular Biology, 26: 227-259 (1991); and the like.
Many more particular methodologies applicable to the invention are disclosed in the
following sample of references: Connolly, Nucleic Acids Research, Vol. 15, pgs. 3131-
3139 (1987); Gibson et al, Nucleic Acids Research, Vol. 15, pgs. 6455-6467 (1987);
Sproat et al, Nucleic Acids Research, Vol. 15, pgs. 4837-4848 (1987); Fung et al, U.S.
patent 4,757,141; Hobbs, Jr., et al, U.S. patent 5,151,507; Cruickshank, U.S. patent
5,091,519 (synthesis of functionalized oligonucleotides for attachment of reporter
groups); Jablonski et al, Nucleic Acids Research, 14: 6115-6128 (1986) (enzyme-
oligonucleotide conjugates); and Urdea et al, U.S. patent 5,124,246 (branched DNA).
Attachment sites of labeling moieties are not critical in embodiments relying on probe
labels to identify nucleotides in the target polynucleotide, provided that such labels do not
interfere with the ligation and cleavage steps. In particular, dyes may be conveniently
attached to the end of the probe distal to the target polynucleotide on either the 3' or 5'
termini of strands making up the probe, e.g. Eckstein (cited above), Fung (cited above),
and the like. In some embodiments, attaching labeling moieties to interior bases or
inter-nucleoside linkages may be preferred.

Preferably, the probes are labeled with one or more fluorescent dyes, e.g. as
disclosed by Menchen et al, U.S. patent 5,188,934; Bergot et al, PCT application
PCT/US90/05565.

In accordance with the invention, a probe of the invention is ligated to an end of 
a target polynucleotide to form a ligated complex in each cycle of ligation and cleavage. 
The ligated complex is the double stranded structure formed after probe and target are 
ligated, usually after the protruding strands of the target polynucleotide and probe 
anneal and at least one pair of the identically oriented strands are caused to be 
covalently linked to one another. Ligation can be accomplished either enzymatically or 
chemically. Chemical ligation methods are well known in the art, e.g. Ferris et al, 
Nucleosides & Nucleotides, 8: 407-414 (1989); Shabarova et al, Nucleic Acids 
Research, 19: 4247-4251 (1991); and the like. Preferably, however, ligation is carried 
out enzymatically using a ligase in a standard protocol. Many ligases are known and are 
suitable for use in the invention, e.g. Lehman, Science, 186: 790-797 (1974); Engler et 
al, DNA Ligases, pages 3-30 in Boyer, editor, The Enzymes, Vol. 15B (Academic
Press, New York, 1982); and the like. Preferred ligases include T4 DNA ligase, T7
DNA ligase, E. coli DNA ligase, Taq ligase, Pfu ligase, and Tth ligase. Protocols for
their use are well known, e.g. Sambrook et al (cited above); Barany, PCR Methods and
Applications, 1: 5-16 (1991); Marsh et al, Strategies, 5: 73-76 (1992); and the like.
Generally, ligases require that a 5' phosphate group be present for ligation to the 3'
hydroxyl of an abutting strand. This is conveniently provided for at least one strand of
the target polynucleotide by selecting a nuclease which leaves a 5' phosphate, e.g.
Fok I.

In a preferred embodiment of the invention employing unphosphorylated probes,
the step of ligating includes (i) ligating the probe to the target polynucleotide with a
ligase so that a ligated complex is formed having a nick on one strand, (ii)
phosphorylating the 5' hydroxyl at the nick with a kinase using conventional protocols,
e.g. Sambrook et al (cited above), and (iii) ligating again to covalently join the strands at
the nick, i.e. to remove the nick.

Preferably, a target polynucleotide for use in the invention is double stranded
and is prepared so that it has a protruding strand at at least one end. The protruding strand
may be either 5' or 3' and, preferably, the number of nucleotides in the protruding
portion of the strand is in the range of from 2 to 6. A target polynucleotide is referred
to as "-k" where k is some integer, e.g. usually between 2 and 6, whenever the 5' strand
is protruding. Conversely, a target polynucleotide is referred to as "+k" whenever the 3'
strand is protruding. For example, the following would be a -4 target polynucleotide in
accordance with this nomenclature:

5'-AACGTTTAC ...
       AAATG ...
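
The "-k"/"+k" nomenclature can be expressed as a small Python sketch. The function name and the padded-string representation are assumptions made for illustration, and the "blunt" return value is an addition not used in the text.

```python
def overhang_label(top, bottom):
    """Classify the left-hand end of an aligned duplex as "-k" or "+k".

    top is written 5'->3' and bottom 3'->5', padded with leading spaces so
    that paired bases share a column, as in the diagram above.
    """
    top_pad = len(top) - len(top.lstrip(" "))
    bottom_pad = len(bottom) - len(bottom.lstrip(" "))
    if top_pad and bottom_pad:
        raise ValueError("only one strand may be padded")
    if bottom_pad:   # the 5' end of the top strand protrudes
        return f"-{bottom_pad}"
    if top_pad:      # the 3' end of the bottom strand protrudes
        return f"+{top_pad}"
    return "blunt"   # label assumed here; not part of the nomenclature

print(overhang_label("AACGTTTAC", "    AAATG"))  # -4
```

Applied to the example target, the four unpaired nucleotides AACG at the 5' end give the label -4.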

In one preferred embodiment of the invention, the target polynucleotide is
anchored to a solid phase support, such as a magnetic particle, polymeric microsphere,
filter material, or the like, which permits the sequential application of reagents without
complicated and time-consuming purification steps. The length of the target
polynucleotide can vary widely; however, for convenience of preparation, lengths
employed in conventional sequencing are preferred. For example, lengths in the range
of a few hundred basepairs, e.g. 200-300, to 1 to 2 kilobase pairs are preferred.

The target polynucleotides can be prepared by various conventional methods.
For example, target polynucleotides can be prepared as inserts of any of the
conventional cloning vectors, including those used in conventional DNA sequencing.
Extensive guidance for selecting and using appropriate cloning vectors is found in
Sambrook et al, Molecular Cloning: A Laboratory Manual, Second Edition (Cold
Spring Harbor Laboratory, New York, 1989), and like references. Sambrook et al and
Innis et al, editors, PCR Protocols (Academic Press, New York, 1990) also provide
guidance for using polymerase chain reactions to prepare target polynucleotides.
Preferably, cloned or PCR-amplified target polynucleotides are prepared which permit
attachment to magnetic beads, or other solid supports, for ease of separating the target
polynucleotide from other reagents used in the method. Protocols for such preparative
techniques are described fully in Wahlberg et al, Electrophoresis, 13: 547-551 (1992);
Tong et al, Anal. Chem., 64: 2672-2677 (1992); Hultman et al, Nucleic Acids Research,
17: 4937-4946 (1989); Hultman et al, Biotechniques, 10: 84-93 (1991); Syvanen et al,
Nucleic Acids Research, 16: 11327-11338 (1988); Dattagupta et al, U.S. patent
4,734,363; Uhlen, PCT application PCT/GB89/00304; and like references. Kits are also
commercially available for practicing such methods, e.g. Dynabeads™ template
preparation kit from Dynal AS (Oslo, Norway).

Populations of target polynucleotides may be prepared in parallel by the use of
microparticles, e.g. magnetic beads, controlled pore glass particles, or the like, that each
have a uniform population of adaptors attached. The adaptor is an oligonucleotide
between about 30 and 100 nucleotides in length that comprises regions for PCR primer
binding, regions that form restriction endonuclease cleavage sites when duplexes are
established, and an address region of about 12-15 nucleotides that permits capture of a
unique target polynucleotide by hybridization. Such adaptors may also comprise other
linking moieties known in the art, e.g. polyethylene glycol arms, or the like. The
population of adaptors on a particular microparticle is uniform in the sense that each
oligonucleotide has the same sequence, so that the same target polynucleotide would be
captured by different adaptors on the same microparticle. Preparation of microparticles
with uniform populations of oligonucleotides is disclosed in PCT publications WO
92/00091, WO 92/03461, and like references. For parallel sequencing, target
polynucleotides are prepared in a library whose vector contains complementary address
regions adjacent to the target polynucleotide insert. After excision and denaturing, the
population of target polynucleotides, each of which now has a complementary address
region on its terminus, is mixed with a population of microparticles under conditions
that permit capture. Individual particles with captured target polynucleotides may be
isolated and manipulated on a microscope slide, e.g. as taught by Lam et al, PCT
publication WO 92/00091 and Lam et al, Science, 354: 82-84 (1991).

"Nuclease" as the term is used in accordance with the invention means any
enzyme, combination of enzymes, or other chemical reagents, or combinations of chemical
reagents and enzymes that, when applied to a ligated complex, discussed more fully
below, cleaves the ligated complex to produce an augmented probe and a shortened
target polynucleotide. A nuclease of the invention need not be a single protein, or
consist solely of a combination of proteins. A key feature of the nuclease, or of the
combination of reagents employed as a nuclease, is that its (their) cleavage site be
separate from its (their) recognition site. The distance between the recognition site of a
nuclease and its cleavage site will be referred to herein as its "reach." By convention,
"reach" is defined by two integers which give the number of nucleotides between the
recognition site and the hydrolyzed phosphodiester bonds of each strand. For example,
the recognition and cleavage properties of Fok I are typically represented as
"GGATG(9/13)" because it recognizes and cuts a double stranded DNA as follows:

5'- ... NNGGATGNNNNNNNNN NNNNNNNNNN ...
3'- ... NNCCTACNNNNNNNNNNNNN NNNNNN ...

where the nucleotides GGATG and their complements CCTAC are Fok I's recognition
site and the N's are arbitrary nucleotides and their complements.
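
The "reach" convention can be made concrete with a minimal Python sketch that applies a GGATG(9/13)-style cut to a duplex. The function name and return layout are hypothetical, and the sketch assumes the recognition site occurs once, on the top strand, in the orientation shown above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def cut_type_iis(top, site="GGATG", reach=(9, 13)):
    """Cut a duplex given as its top strand (5'->3').

    Returns (top_fragments, bottom_fragments, overhang_length), with the
    bottom strand written 3'->5' so that it aligns under the top strand.
    """
    start = top.find(site)
    if start < 0:
        raise ValueError("recognition site not found")
    site_end = start + len(site)
    top_cut = site_end + reach[0]       # 9 nucleotides past the site
    bottom_cut = site_end + reach[1]    # 13 nucleotides past the site
    if bottom_cut > len(top):
        raise ValueError("cleavage site lies beyond the end of the duplex")
    bottom = top.translate(COMPLEMENT)
    return ((top[:top_cut], top[top_cut:]),
            (bottom[:bottom_cut], bottom[bottom_cut:]),
            reach[1] - reach[0])

tops, bottoms, overhang = cut_type_iis("AAGGATGACGTACGTACCTTGGAACC")
print(tops[1][:overhang])  # CCTT
```

Because the two cuts are offset by 13 - 9 = 4 nucleotides, the downstream fragment is left with a four-nucleotide 5' protruding strand, here CCTT.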

It is important that the nuclease only cleave the target polynucleotide after it 
forms a complex with its recognition site; and preferably, the nuclease leaves a 
protruding strand on the target polynucleotide after cleavage. 

Cleavage with a nuclease can be accomplished using chemical nucleases, e.g. as
disclosed by Sigman et al, Ann. Rev. Biochem., 59: 207-236 (1990); Le Doan et al,
Nucleic Acids Research, 15: 7749-7760 (1987); U.S. patent 4,795,700; Francois et al,
Proc. Natl. Acad. Sci., 86: 9702-9706 (1989); and like references. Preferably, such
embodiments comprise an oligonucleotide moiety linked to a cleavage moiety, wherein
the oligonucleotide moiety recognizes the ligated complex by triple helix formation.
There is extensive guidance in the literature for selecting appropriate sequences,
orientation, conditions, nucleoside type (e.g. whether ribose or deoxyribose nucleosides
are employed), and base modifications (e.g. methylated cytosine, and the like) in order to
maximize, or otherwise regulate, triplex stability as desired in particular embodiments,
e.g. Roberts et al, Proc. Natl. Acad. Sci., 88: 9397-9401 (1991); Roberts et al, Science,
258: 1463-1466 (1992); Distefano et al, Proc. Natl. Acad. Sci., 90: 1179-1183 (1993);
Mergny et al, Biochemistry, 30: 9791-9798 (1991); Cheng et al, J. Am. Chem. Soc.,
114: 4465-4474 (1992); Beal and Dervan, Nucleic Acids Research, 20: 2773-2776
(1992); Beal and Dervan, J. Am. Chem. Soc., 114: 4976-4982 (1992); Giovannangeli et
al, Proc. Natl. Acad. Sci., 89: 8631-8635 (1992); Moser and Dervan, Science, 238:
645-650 (1987); McShan et al, J. Biol. Chem., 267: 5712-5721 (1992); Yoon et al,
Proc. Natl. Acad. Sci., 89: 3840-3844 (1992); Blume et al, Nucleic Acids Research, 20:
1777-1784 (1992); and the like. Preferably, such chemical nucleases are employed with
an exonuclease which can produce a protruding strand after cleavage. Although current
chemical nucleases are limited in that their cleavage sites vary around an expected site,
they can be employed in fingerprinting, sequence comparisons, and other uses that only
require partial sequence information.

Preferably, nucleases employed in the invention are natural protein
endonucleases (i) whose recognition site is separate from its cleavage site and (ii) whose
cleavage results in a protruding strand on the target polynucleotide. Most preferably,
class IIs restriction endonucleases are employed as nucleases in the invention, e.g. as
described in Szybalski et al, Gene, 100: 13-26 (1991); Roberts et al, Nucleic Acids
Research, 21: 3125-3137 (1993); and Livak and Brenner, U.S. patent 5,093,245.
Exemplary class IIs nucleases for use with the invention include Alw XI, Bsm AI, Bbv I,
Bsm FI, Sts I, Hga I, Bsc AI, Bbv II, Bce fI, Bce 85I, Bcc I, Bcg I, Bsa I, Bsg I, Bsp
MI, Bst 71 I, Ear I, Eco 57I, Esp 3I, Fau I, Fok I, Gsu I, Hph I, Mbo II, Mme I, Rle AI,
Sap I, Sfa NI, Taq II, Tth 111 II, Bco 5I, Bpu AI, Fin I, Bsr DI, and isoschizomers
thereof. Preferred nucleases include Fok I, Hga I, Ear I, and Sfa NI.

Preferably, prior to nuclease cleavage steps, usually at the start of a sequencing
operation, the target polynucleotide is treated to block the recognition sites and/or
cleavage sites of the nuclease being employed. This prevents undesired cleavage of the
target polynucleotide because of the fortuitous occurrence of nuclease recognition sites
at interior locations in the target polynucleotide. Blocking can be achieved in a variety
of ways, including methylation and treatment by sequence-specific aptamers, DNA
binding proteins, or oligonucleotides that form triplexes. Whenever natural protein
endonucleases are employed, recognition sites can be conveniently blocked by
methylating the target polynucleotide with the cognate methylase of the nuclease being
used. That is, for most if not all type II bacterial restriction endonucleases, there exists a
so-called "cognate" methylase that methylates its recognition site. Many such
methylases are disclosed in Roberts et al (cited above) and Nelson et al, Nucleic Acids
Research, 21: 3139-3154 (1993), and are commercially available from a variety of
sources, particularly New England Biolabs (Beverly, MA).
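
Where the target sequence (or a related reference sequence) is already known, the fortuitous interior recognition sites that such blocking is meant to protect can be located with a short Python sketch. The function names are hypothetical, and the sketch assumes a non-palindromic site such as GGATG; a palindromic site would be reported once per strand.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string over the letters A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_internal_sites(target, site="GGATG"):
    """Return sorted (position, strand) hits of a recognition site.

    target is the top strand, 5'->3'; positions are 0-based on that strand.
    "+" means the site reads on the top strand, "-" on the bottom strand.
    """
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        pos = target.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = target.find(pattern, pos + 1)
    return sorted(hits)

print(find_internal_sites("TTGGATGAACATCCTT"))  # [(2, '+'), (9, '-')]
```

Both orientations are scanned because a site on either strand would direct cleavage into the target.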

In accordance with the invention, after a probe is ligated to the target
polynucleotide to form a ligated complex, the ligated complex is cleaved with a nuclease
to give an augmented probe and a shortened target polynucleotide. This occurs because
the probe is designed such that the distance from the recognition site of the probe to the end
of the probe is less than the distance from the recognition site to the cleavage site of the
nuclease. That is, the nuclease necessarily cleaves in a region of the target
polynucleotide, thereby shortening it by one or more nucleotides in each cycle, as
illustrated in Figure 2. Conversely, in each cycle the probe has one or more nucleotides
added to it after cleavage to form an augmented probe. In Figure 2, ligated complex 20
is shown with recognition site 22 of the Fok I nuclease. The terminus 24 of the probe is
one nucleotide to the left of the Fok I cleavage site 26. Thus, in the illustrated
embodiment, ligation leads to the identification of the terminal thymidine on the target
polynucleotide and cleavage results in the shortening of each strand of the target
polynucleotide by one nucleotide. The nucleotides removed by the cleavage together
with the probe to which they remain attached form an augmented probe.
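
The design constraint just stated reduces to a subtraction, sketched below in Python. The function and parameter names are hypothetical, and the spacer value of 8 is an assumed figure chosen only so that a reach of 9 reproduces the one-nucleotide step of the illustrated embodiment.

```python
def nucleotides_removed_per_cycle(reach, spacer):
    """Nucleotides cut from the target in each cycle, for one strand.

    reach:  nucleotides between the recognition site and the cleavage site.
    spacer: nucleotides between the recognition site and the probe terminus
            on the same strand (an assumed design parameter).
    """
    removed = reach - spacer
    if removed < 1:
        raise ValueError("cleavage site would not lie in the target")
    return removed

print(nucleotides_removed_per_cycle(reach=9, spacer=8))  # 1
```

A shorter spacer moves the cleavage site deeper into the target, so that more nucleotides are removed, and transferred to the augmented probe, in each cycle.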

As mentioned above, the method of the invention is preferably carried out in the
following steps: (a) ligating a probe to an end of the polynucleotide having a protruding
strand to form a ligated complex, the probe having a complementary protruding strand
to that of the polynucleotide and the probe having a nuclease recognition site; (b)
removing unligated probe from the ligated complex; (c) identifying one or more
nucleotides in the protruding strand of the polynucleotide; (d) cleaving the ligated
complex with a nuclease; and (e) repeating steps (a) through (d) until the nucleotide
sequence of the polynucleotide is determined. Identification of the one or more
nucleotides in the protruding strand of the target polynucleotide is carried out either
before or after the cleavage step, depending on the embodiment of the invention being
implemented. Detection prior to
cleavage is preferred in embodiments where sequencing is carried out in parallel on a
plurality of sequences (either segments of a single target polynucleotide or a plurality of
altogether different target polynucleotides), e.g. attached to separate magnetic beads, or
other types of solid phase supports. Detection either before or after cleavage may be
carried out in embodiments where a homogeneous population of target polynucleotides
is being analyzed, e.g. where a population of solid phase supports, such as magnetic beads, all
have the identical target polynucleotide attached. In such cases, other factors may
dictate the ordering of the detection and cleavage steps, such as the detection scheme
being employed, whether the sequencing reactions are being carried out in separate
reaction mixtures or whether they take place in a common mixture, and the like.
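
Steps (a) through (e) can be traced with an idealized Python simulation. Everything in it is a simplifying assumption made for illustration: ligation and cleavage are taken to be perfectly efficient, each probe mixture is represented only by its fixed base, that base is assumed to pair with the terminal nucleotide of the protruding strand, and one nucleotide is removed per cycle.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_ligation_and_cleavage(target, overhang=4):
    """Toy simulation of steps (a)-(e) on an idealized -k target.

    target: top strand, 5'->3', whose first `overhang` bases protrude.
    Stops when fewer than `overhang` bases remain to form a protruding strand.
    """
    calls = []
    while len(target) >= overhang:
        terminal = target[0]
        # (a) Only the probe mixture whose fixed base complements the
        #     terminal base of the protruding strand ligates.
        ligated_probe_base = COMPLEMENT[terminal]
        # (b) Unligated probe is washed away (nothing to model here).
        # (c) The label of the ligated probe identifies the terminal base.
        calls.append(COMPLEMENT[ligated_probe_base])
        # (d) Cleavage removes one nucleotide and exposes a new overhang.
        target = target[1:]
        # (e) The loop repeats steps (a) through (d).
    return "".join(calls)

print(sequence_by_ligation_and_cleavage("AACGTTTAC"))  # AACGTT
```

The simulation reads the strand one nucleotide per cycle and says nothing about signal loss, which in practice limits the number of cycles.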

In further preference, the method includes a capping step after the unligated
probe is washed from the target polynucleotide. In a capping step, by analogy with
polynucleotide synthesis, e.g. Andrus et al, U.S. patent 4,816,571, target
polynucleotides that have not undergone ligation to a probe are rendered inert to further
ligation steps in subsequent cycles. In this manner spurious signals from "out of phase"
cleavages are prevented. When a nuclease leaves a 5' protruding strand on the target
polynucleotides, capping is preferably accomplished by exposing the unreacted target
polynucleotides to a mixture of the four dideoxynucleoside triphosphates, or other
chain-terminating nucleoside triphosphates, and a DNA polymerase. The DNA
polymerase extends the 3' strand of the unreacted target polynucleotide by one
chain-terminating nucleotide, e.g. a dideoxynucleotide, thereby rendering it incapable of
ligating in subsequent cycles.
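
The effect of capping on the usable population can be estimated with a back-of-the-envelope Python sketch. The per-cycle ligation efficiency of 0.95 is an arbitrary assumed value, not a figure from this disclosure, and capping and cleavage are assumed to be complete.

```python
def in_phase_fraction(ligation_efficiency, cycles):
    """Fraction of starting targets still being sequenced after `cycles`.

    Assumes every target that fails to ligate in a cycle is capped and
    drops out, so the survivors are all in phase.
    """
    if not 0.0 <= ligation_efficiency <= 1.0:
        raise ValueError("ligation_efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return ligation_efficiency ** cycles

for n in (1, 10, 20):
    print(n, round(in_phase_fraction(0.95, n), 3))
```

Under these assumptions the signal declines geometrically with cycle number but remains free of out-of-phase contributions.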

Clearly, one of ordinary skill in the art could combine features of the 
embodiments set forth above to design still further embodiments in accordance with the 
invention, but not expressly set forth above. 

An important aspect of the invention is "multiple stepping," or the simultaneous
use of a plurality of nucleases which cleave at different distances from the ligation site to
sequence a target polynucleotide. The use of multiple nucleases having different
reaches permits one to periodically "restart" the sequencing process by capping
sequences involved in prior or current cycles of ligation and cleavage and by beginning a
new cycle of ligation and cleavage on a "fresh" set of target polynucleotides whose
protruding strands are exposed by cleavage with a long reach nuclease. By employing
multiple nucleases in this manner the number of nucleotides that can be determined on a
set of target polynucleotides can be increased over that which can be done with a single
nuclease.

In using multiple nucleases it is important that one be able to convert the
protruding strand of a target polynucleotide from one form to another. For example, one
may wish to apply both Fok I (which leaves a -4 target polynucleotide) and Ear I (which
leaves a -3 target polynucleotide) to a target sequence, i.e. "double stepping". As
described more fully below, in order to do this, one must be able to convert the -4 target
polynucleotide to a -3 target polynucleotide without loss of information. This can be
accomplished by providing a conversion probe that has the following properties: i) a
protruding strand compatible with the current target polynucleotide protruding strand,
i.e. having the same number of nucleotides in antiparallel orientation, ii) a nuclease
recognition site of the nuclease being converted to, and iii) a spacer region selected so
that the cut site of the new nuclease corresponds to at least one of the ligation sites of
the two strands. Preferably, the conversion probe permits ligation of only one strand
and one of the unligated sites, i.e. nicks, is located at the cleavage site of the nuclease
being converted to.

Figures 3a through 3h diagrammatically illustrate this aspect of the invention in
the case where two nucleases are employed, a first nuclease which permits cleavage ten
nucleotides from the ligation site and a second nuclease which permits cleavage of one
nucleotide from the ligation site. The process illustrated in the figures is readily generalized
to more than two nucleases. In Figure 3a, a mixture of probes 34 and 36 is ligated to
the target polynucleotides 30 attached to solid phase support 32. Probe 34 contains a
nuclease recognition site of a first nuclease that has a long reach, e.g. ten nucleotides,
and a short spacer region so that its associated nuclease cleaves deeply into the target
polynucleotide. Probe 36 converts (if necessary) the protruding strand of the target
polynucleotides (initially prepared for the first nuclease) to a protruding strand
corresponding to a second nuclease used to cleave the target polynucleotide one
nucleotide at a time. With the appropriate protruding strand available, the second
nuclease is employed in nine cycles of ligation and cleavage followed by a capping step
to give the identity of the first nine nucleotides of the target polynucleotide. As
illustrated in Figure 3b, capped sequences 38 no longer participate in ligation and
cleavage cycles. The number of capped sequences produced in this step depends on the
mixture of the two probes employed which, in turn, depends on several factors,
including the length of the target polynucleotide, the nature of the label on the probes,
the efficiencies of ligation and cleavage of the enzymes employed, and the like. The
target polynucleotides 41 are then cleaved at 40 with the first nuclease, shown in Figure
3c, to produce appropriate protruding strands at the termini of the target
polynucleotides and the identity of the tenth nucleotide. After cleavage and washing, a
mixture of probes 34 and 36 is ligated to the non-capped target polynucleotides 42
(Figure 3d) to form ligated complexes. The ligated complexes including probe 36 are
cleaved to convert the protruding strands of their associated target polynucleotides to
protruding strands corresponding to the second nuclease, after which another nine
cycles of ligation and cleavage take place followed by a capping step, to form a second
set of capped sequences 44 (Figure 3e). In this series of cycles the identities of
nucleotides 11 through 19 are determined.

Next the target polynucleotides are cleaved with the first nuclease at 46 (in
Figure 3f) to produce protruding strands on target polynucleotides 48, after which a
mixture of probes 34 and 36 is ligated to the target polynucleotides to form ligated
complexes 50 (Figure 3g). The ligated complexes comprising probe 36 are again
cleaved to convert the protruding strands of their associated target polynucleotides to
ones corresponding to the second nuclease, after which nine cycles of ligation and
cleavage take place followed by a capping step, to form a third set of capped sequences
52 (Figure 3h). This set of cycles leads to the identification of nucleotides 21 through
29.

This process continues until the nucleotide sequence of the target polynucleotide 
is determined or until the remaining population of target polynucleotides is too small to 
generate a detectable signal. 
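
The order in which positions are read in the scheme above can be tabulated with a short script. This is only an illustrative sketch: the function name and the assumption that every round consists of exactly nine single-step cycles followed by one cleavage with the first nuclease are taken from the description of Figure 3, not from any software disclosed here.

```python
def double_stepping_schedule(rounds, single_steps=9):
    """List, per round, the positions read by single-step cycles and the
    position read at the cleavage with the first (long-reach) nuclease."""
    block = single_steps + 1  # nine single steps plus one long-reach step
    schedule = []
    for r in range(rounds):
        start = r * block + 1
        stepped = list(range(start, start + single_steps))
        long_reach = start + single_steps
        schedule.append((stepped, long_reach))
    return schedule


if __name__ == "__main__":
    for stepped, long_reach in double_stepping_schedule(3):
        print(f"single steps: {stepped[0]}-{stepped[-1]}; first-nuclease cleavage: {long_reach}")
```

For three rounds the single-step cycles cover positions 1-9, 11-19, and 21-29, with positions 10, 20, and 30 falling at the first-nuclease cleavages, in line with the text.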

The invention includes systems and apparatus for carrying out sequencing
automatically. Such systems and apparatus can take a variety of forms depending on
several design constraints, including i) the nature of the solid phase support used to
anchor the target polynucleotide, ii) the degree of parallel operation desired, iii) the
detection scheme employed, iv) whether reagents are re-used or discarded, and the like.
Generally, the apparatus comprises a series of reagent reservoirs, one or more reaction
vessels containing target polynucleotide, preferably attached to a solid phase support,
e.g. magnetic beads, one or more detection stations, and a computer controlled means
for transferring in a predetermined manner reagents from the reagent reservoirs to and
from the reaction vessels and the detection stations. The computer controlled means for
transferring reagents and controlling temperature can be implemented by a variety of
general purpose laboratory robots, such as those disclosed by Harrison et al,
Biotechniques, 14: 88-97 (1993); Fujita et al, Biotechniques, 9: 584-591 (1990); Wada
et al, Rev. Sci. Instrum., 54: 1569-1572 (1983); or the like. Such laboratory robots are



WO 95/27080 PCT/US95/03678



also available commercially, e.g. Applied Biosystems model 800 Catalyst (Foster City,
CA).

A variety of kits are provided for carrying out different embodiments of the
invention. Generally, kits of the invention include probes tailored for the nuclease and
the detection scheme of the particular embodiment. Kits further include the nuclease
reagents, the ligation reagents, and instructions for practicing the particular embodiment
of the invention. In embodiments employing natural protein endonucleases and ligases,
ligase buffers and nuclease buffers may be included. In some cases, these buffers may
be identical. Such kits may also include a methylase and its reaction buffer and a kinase
and its reaction buffer. Preferably, kits also include a solid phase support, e.g. magnetic
beads, for anchoring target polynucleotides. In one preferred kit, fluorescently labeled
probes are provided such that probes corresponding to different terminal nucleotides of
the target polynucleotide carry distinct spectrally resolvable fluorescent dyes. As used
herein, "spectrally resolvable" means that the dyes may be distinguished on the basis of
their spectral characteristics, particularly fluorescence emission wavelength, under
conditions of operation. Thus, the identity of the one or more terminal nucleotides would
be correlated to a distinct color, or perhaps ratio of intensities at different wavelengths.
More preferably, four such probes are provided that allow a one-to-one correspondence
between each of four spectrally resolvable fluorescent dyes and the four possible
terminal nucleotides on a target polynucleotide. Sets of spectrally resolvable dyes are
disclosed in U.S. patents 4,855,225 and 5,188,934; International application
PCT/US90/05565; and Lee et al, Nucleic Acids Research, 20: 2471-2483 (1992).

Example 1

Sequencing a Target Polynucleotide
Amplified from pUC19

A 368 basepair fragment of pUC19 is amplified by PCR for use as a test target
polynucleotide. The 5' terminal nucleotide of the coding strand is at position 393 and
the 3' terminal nucleotide of the coding strand is at position 740, Yanisch-Perron et al,
Gene, 33: 103-119 (1985), so that the polylinker region is spanned. Two 18-mer
primers are employed having sequences 5'-AGTGAATTCGAGCTCGGT and
5'-xCCTTTGAGTGAGCTGATA, where "x" is an amino linking group, Aminolinker II
(Applied Biosystems, Inc., Foster City, California), to which a biotin moiety is attached
using manufacturer's protocol, 5' Biotin NIO-Label Kit (Clontech Laboratories, Palo
Alto, California). The amplified target polynucleotide is isolated and attached to
streptavidin-coated magnetic beads (Dynabeads) using manufacturer's protocol,
Dynabeads Template Preparation Kit, with M280-streptavidin (Dynal, Inc., Great Neck,
New York). A sufficient quantity of the biotinylated 393 basepair fragment is provided
to load about 300 µg of Dynabeads M280-Streptavidin. After loading onto the
Dynabeads, the target polynucleotides are digested with Eco RI and washed to provide
a 5'-monophosphorylated protruding strand with an overhang of four nucleotides, i.e. a
-4 target polynucleotide, shown below.

    5'-pAATTCGAGCTCGGTACCCGGGGATCCTCTA . . .
            GCTCGAGCCATGGGCCCCTAGGAGAT . . .

Reactions and washes below are generally carried out in 50 µL volumes of
manufacturer's (New England Biolabs') recommended buffers for the enzymes
employed, unless otherwise indicated. Standard buffers are also described in Sambrook
et al, Molecular Cloning, 2nd Edition (Cold Spring Harbor Laboratory Press, 1989).
Note that in this test example, methylation is not required because no Fok I recognition
sequences are present in the target polynucleotide.

The following four sets of mixed probes are provided for addition to the target
polynucleotide:



    TAMRA- ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGANNN

    FAM-   ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGCNNN

    ROX-   ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGGNNN

    JOE-   ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGTNNN

where TAMRA, FAM, ROX, and JOE are spectrally resolvable fluorescent labels
attached by way of Aminolinker II (all being available from Applied Biosystems, Inc.,
Foster City, California); the bold faced nucleotides are the recognition site for Fok I
endonuclease, and "N" represents any one of the four nucleotides, A, C, G, T. TAMRA
(tetramethylrhodamine), FAM (fluorescein), ROX (rhodamine X), and JOE (2',7'-
dimethoxy-4',5'-dichlorofluorescein) and their attachment to oligonucleotides are also
described in Fung et al, U.S. patent 4,855,225.
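
Because each "N" stands for any of the four nucleotides, every dye-labeled set above is a mixture of 4 x 4 x 4 = 64 distinct lower strands. The following sketch enumerates them; the names `COMMON_LOWER`, `DYE_TO_FIXED_BASE`, and `expand_probe_set` are hypothetical and chosen only for illustration, and the strings are written in the same left-to-right order as the lower strands printed above.

```python
from itertools import product

# Lower-strand portion shared by all four probe sets, as printed above.
COMMON_LOWER = "TAGCCTACTGTAGTTG"

# Fixed (non-degenerate) overhang base carried by each dye-labeled set.
DYE_TO_FIXED_BASE = {"TAMRA": "A", "FAM": "C", "ROX": "G", "JOE": "T"}


def expand_probe_set(dye):
    """Return every lower-strand sequence in the mixed probe for one dye."""
    fixed = DYE_TO_FIXED_BASE[dye]
    return [
        COMMON_LOWER + fixed + "".join(tail)
        for tail in product("ACGT", repeat=3)  # the three degenerate N positions
    ]


if __name__ == "__main__":
    for dye in DYE_TO_FIXED_BASE:
        members = expand_probe_set(dye)
        print(dye, len(members), members[0], members[-1])
```

Only the fixed base differs between sets, which is why the dye reports a single nucleotide of the target while the degenerate positions simply ensure that some member of the mixture pairs with the rest of the protruding strand.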

Each of the above probes is separately incubated in sequence in approximately
5-fold molar excess of the target polynucleotide ends as follows: the probe is incubated
for 60 minutes at 16°C with 200 units of T4 DNA ligase and the anchored target
polynucleotide in 50 µL of T4 DNA ligase buffer; after washing, the target
polynucleotide is then incubated with 100 units T4 polynucleotide kinase in the
manufacturer's recommended buffer for 30 minutes at 37°C, washed, and again
incubated for 30 minutes at 16°C with 200 units of T4 DNA ligase and the anchored
target polynucleotide in 50 µL of T4 DNA ligase buffer. Washing is accomplished by
immobilizing the magnetic bead support with a magnet and successively adding then
removing 50 µL volumes of wash buffer, e.g. TE, disclosed in Sambrook et al (cited
above). After the cycle of ligation-phosphorylation-ligation and a final washing, the
beads are interrogated for the presence of fluorescent label. On the fourth set of such
incubations, the characteristic fluorescence of JOE is detected indicating that the
terminal nucleotide is A. The labeled target polynucleotide, i.e. the ligated complex, is
then incubated with 10 units of Fok I in 50 µL of manufacturer's recommended buffer
for 30 minutes at 37°C, followed by washing in TE. As a result the target
polynucleotide is shortened by one nucleotide on each strand and is ready for the next
cycle of ligation and cleavage. The process is continued until the desired number of
nucleotides are identified.
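
The cycle just described can be mimicked in a few lines of code. The sketch below is an idealized model, not a description of any instrument software: it assumes perfect ligation fidelity and complete cleavage, tracks only the upper strand, and uses the published Fok I cut position on the recognition-site strand (9 nucleotides beyond GGATG), a value that is not stated in this example. All names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Upper strand of the probes of this example; it carries the Fok I site GGATG.
PROBE_TOP = "ATCGGATGACATCAAC"
FOK_I_SITE = "GGATG"
FOK_I_TOP_REACH = 9  # assumed from the published Fok I specificity

# Dye attached to the probe set carrying each fixed overhang base.
FIXED_BASE_TO_DYE = {"A": "TAMRA", "C": "FAM", "G": "ROX", "T": "JOE"}


def sequence_by_ligation(target_top, cycles):
    """Return (dye, called nucleotide) for each ligation/cleavage cycle."""
    calls = []
    for _ in range(cycles):
        terminal = target_top[0]
        # Only the set whose fixed base pairs with the terminal nucleotide ligates.
        dye = FIXED_BASE_TO_DYE[COMPLEMENT[terminal]]
        calls.append((dye, terminal))
        # Fok I cuts a fixed distance beyond its site, one nucleotide into the target.
        ligated_top = PROBE_TOP + target_top
        site_end = ligated_top.index(FOK_I_SITE) + len(FOK_I_SITE)
        target_top = ligated_top[site_end + FOK_I_TOP_REACH:]
    return calls


if __name__ == "__main__":
    target = "AATTCGAGCTCGGTACCCGGGGATCCTCTA"  # upper strand shown above
    for dye, base in sequence_by_ligation(target, 4):
        print(dye, base)
```

In this model the first cycle reports JOE for a terminal A, as in the text, and each cleavage removes exactly one nucleotide because the probe places the recognition site eight nucleotides from the ligation junction.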

Example 2

Converting a -4 Protruding Strand to a
-3 Protruding Strand

A -4 protruding strand is converted into a -3 protruding strand using the
conversion probe shown below having an Ear I recognition site (indicated in bold) and a
protruding strand whose terminal nucleotide is non-phosphorylated (indicated in lower
case). The conversion probe is ligated to the terminus of the target polynucleotide using
conditions as described in Example 1:

    ACTCTTC         +    pNNNNTACCGG . . .
    TGAGAAGNNNn               ATGGCC . . .

                    ↓

    ACTCTTCNNNNTACCGG . . .
    TGAGAAGNNNnATGGCC . . .

After ligation, the complex is digested with Ear I using manufacturer's recommended
protocol to give a target polynucleotide with a -3 protruding strand:

    ACTCTTCNNNNTACCGG . . .
    TGAGAAGNNNnATGGCC . . .

                    ↓

    ACTCTTCN        +    pNNNTACCGG
    TGAGAAGNNNn              ATGGCC



Example 3

Converting a -4 Protruding Strand to a
-5 Protruding Strand

A -4 protruding strand is converted into a -5 protruding strand using the
conversion probe shown below having an Hga I recognition site (indicated in bold) and
a protruding strand whose terminal nucleotide is non-phosphorylated (indicated in lower
case). The conversion probe is ligated to the terminus of the target polynucleotide using
conditions as described in Example 1:

    AGACGCCATCAT         +    pNNNNTACCGG
    TCTGCGGTAGTANNNn               ATGGCC

                         ↓

    AGACGCCATCATNNNNTACCGG
    TCTGCGGTAGTANNNnATGGCC

After ligation, the complex is digested with Hga I using manufacturer's recommended
protocol to give a target polynucleotide with a -5 protruding strand:

    AGACGCCATCATNNNNTACCGG
    TCTGCGGTAGTANNNnATGGCC

                         ↓

    AGACGCCATCA          +    pTNNNNTACCGG
    TCTGCGGTAGTANNNn                ATGGCC
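
The conversions in Examples 2 and 3 follow from where each nuclease cuts relative to its recognition site. The sketch below applies the published cut positions for Fok I (9/13), Ear I (1/4), and Hga I (5/10) to the upper strand of a ligated complex; these values come from the general restriction-enzyme literature rather than from this text, and the table and function names are hypothetical. The model ignores nicks left by the non-phosphorylated probe terminus.

```python
# Recognition site and cut distances (recognition-site strand, opposite strand)
# measured in nucleotides beyond the end of the site. Assumed published values.
TYPE_IIS = {
    "Fok I": ("GGATG", 9, 13),
    "Ear I": ("CTCTTC", 1, 4),
    "Hga I": ("GACGC", 5, 10),
}


def cleave(ligated_top, enzyme):
    """Return (upper strand of the shortened target, protruding-strand length)."""
    site, top_reach, bottom_reach = TYPE_IIS[enzyme]
    site_end = ligated_top.index(site) + len(site)
    top_cut = site_end + top_reach
    bottom_cut = site_end + bottom_reach
    # Nucleotides between the two cuts are left single stranded.
    overhang = ligated_top[top_cut:bottom_cut]
    return ligated_top[top_cut:], len(overhang)


if __name__ == "__main__":
    print(cleave("ACTCTTCNNNNTACCGG", "Ear I"))       # complex of Example 2
    print(cleave("AGACGCCATCATNNNNTACCGG", "Hga I"))  # complex of Example 3
```

Under these assumptions the Ear I complex yields the -3 end `NNNTACCGG` and the Hga I complex yields the -5 end `TNNNNTACCGG`, matching the diagrams.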



Example 4

Converting a +2 Protruding Strand to a
-3 Protruding Strand

A +2 protruding strand is converted into a -3 protruding strand using the
conversion probe shown below having an Ear I recognition site (indicated in bold) and a
protruding strand whose terminal nucleotide is non-phosphorylated (indicated in lower
case). The conversion probe is ligated to the terminus of the target polynucleotide using
conditions as described in Example 1:

    ACTCTTCGNN      +      pTACCGG
    TGAGAAGc             NNATGGCC

                    ↓

    ACTCTTCGNNTACCGG . . .
    TGAGAAGcNNATGGCC . . .

After ligation, the complex is digested with Ear I using manufacturer's recommended
protocol to give a target polynucleotide with a -3 protruding strand:

    ACTCTTCGNNTACCGG . . .
    TGAGAAGcNNATGGCC . . .

                    ↓

    ACTCTTCG    +    NNA    +    pNNTACCGG
    TGAGAAGc                        ATGGCC



Example 5

Double Stepping: Sequencing by Ligation Employing
Two Restriction Endonucleases

Two nucleases, Ear I and Fok I, with different reaches are used in the same
sequencing operation. The procedure is illustrated in Fig. 3. A 368 basepair fragment
of pUC19 with a -4 protruding strand is prepared as described in Example 1. Because
the fragment contains an Ear I site (but no Fok I site), the target polynucleotide is
initially treated with an Ear I methylase, e.g. as described in Nelson et al, Nucleic Acids
Research, 17: r398-r415 (1989). Afterwards, a 9:1 mixture of the following two
probes, Probe A:Probe B, is combined in about 5-fold molar excess with the target
polynucleotide, ligated, kinased, and ligated, as described in Example 1 to form two
populations of ligated complexes: about 10% terminating with Probe B and about 90%
terminating with Probe A.

    Probe (A):   ATCGGATG             (Fok I recognition site)
                 TAGCCTACNNNn

    Probe (B):   CAGATCCTCTTCa        (Ear I recognition site)
                 GTCTAGGAGAAGTNNNN

The target polynucleotides are then digested with Ear I to convert about 10% of the
ligated complexes to a target polynucleotide having a -3 protruding strand. The
following probes are then used in nine cycles of ligation-phosphorylation-
ligation/identification/cleavage as described in Example 1 to give the identity of the first
nine nucleotides.

Ear I Probes

    TAMRA- CAGATCCTCTTC
           GTCTAGGAGAAGGNN

    FAM-   CAGATCCTCTTC
           GTCTAGGAGAAGCNN

    ROX-   CAGATCCTCTTC
           GTCTAGGAGAAGANN

    JOE-   CAGATCCTCTTC
           GTCTAGGAGAAGTNN

After the ninth cleavage and washing, the subpopulation of target polynucleotides that
underwent the nine cycles of cleavage are capped by treating with a DNA polymerase in
the presence of the four dideoxynucleoside triphosphates. After washing again, the
target polynucleotides are digested with Fok I to give target polynucleotides with a -4
protruding strand. Thus, at this point 10% of the original population of target
polynucleotides is 9 nucleotides shorter (on average) and capped and 90% are precisely
9 nucleotides shorter and ready for successive cycles of cleavage and ligation.

To the Fok I digested target polynucleotides is added an 8:1 mixture of Probe
A:Probe B in a ligase buffer as described above. This results in approximately the same
quantity of target polynucleotide being prepared for Ear I digestion as above.
Alternatively, a constant ratio of Probe A:Probe B could be employed throughout the
sequencing operation, which would lead to a less intense signal at each successive Fok I
digestion step, but may also permit a longer sequence to be determined. Ear I is added
to the resulting mixed population of ligated complexes under the manufacturer's
recommended protocol to convert a subpopulation to target polynucleotides with -3
protruding strands. The Ear I probes are again applied nine times as described above to
provide the identity of nucleotides 10 through 18. The process is continued as
described above until the identities of the 90 terminal nucleotides of the target
polynucleotide are obtained.
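
The change from a 9:1 to an 8:1 probe mixture can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that the two probes ligate in proportion to their share of the mixture and that each round converts 10% of the original population; the function names are hypothetical.

```python
from fractions import Fraction


def probe_ratio_schedule(rounds, converted_per_round=Fraction(1, 10)):
    """Probe A:Probe B ratio needed each round to convert a fixed share
    of the ORIGINAL population, given that earlier shares are capped."""
    ratios = []
    remaining = Fraction(1)
    for _ in range(rounds):
        fraction_b = converted_per_round / remaining
        if fraction_b >= 1:
            break  # nothing would be left for Probe A
        ratios.append((1 - fraction_b) / fraction_b)
        remaining -= converted_per_round
    return ratios


def constant_ratio_signal(rounds, fraction_b=Fraction(1, 10)):
    """Share of the original population converted each round when the
    probe mixture is held constant instead."""
    return [fraction_b * (1 - fraction_b) ** k for k in range(rounds)]


if __name__ == "__main__":
    print([str(r) for r in probe_ratio_schedule(3)])
    print([str(s) for s in constant_ratio_signal(3)])
```

With these assumptions the adjusted schedule begins 9:1, 8:1, 7:1, whereas a constant 9:1 mixture converts 1/10, 9/100, and 81/1000 of the original population in successive rounds, which is the declining signal mentioned as the alternative.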

Example 6

Sequencing a Target Polynucleotide
Amplified from pGEM7Z: Identification of
Nucleotides by the Ligation Reaction

In this example, a segment of plasmid pGEM7Z (Promega, Madison, WI) was
amplified and attached to glass beads via a double stranded DNA linker, one strand of
which was synthesized directly onto (and therefore covalently linked to) the beads. In
each sequencing cycle after ligation, an aliquot of beads was removed from the reaction
mixture and loaded onto a gel electrophoresis column for analyzing the non-covalently
bound strand of the ligated complex. The probes were designed so that the non-
covalently bound strand would always carry a fluorescent label for analysis.

A 47-mer oligonucleotide was synthesized directly on KF169 Ballotini beads
using a standard automated DNA synthesizer protocol. The complementary strand to
the 47-mer was synthesized separately and purified by HPLC. When hybridized the
resulting duplex has a Bst XI restriction site at the end distal from the bead. The
complementary strand was hybridized to the attached 47-mer in the following mixture:
25 µl complementary strand at 200 pmol/µl; 20 mg KF169 Ballotini beads with the 47-
mer; 6 µl New England Biolabs #3 restriction buffer; and 25 µl distilled water. The
mixture was heated to 93°C and then slowly cooled to 55°C, after which 40 units of
Bst XI (at 10 units/µl) was added to bring the reaction volume to 60 µl. The mixture
was incubated at 55°C for 2 hours after which the beads were washed three times in TE
(pH 8.0).

The segment of pGEM7Z to be attached to the beads was prepared as follows:
Two PCR primers were prepared using standard protocols:

    Primer 1: 5'-CTAAACCATTGGTATGGGCCAGTGAATTGTAATA

    Primer 2: 5'-CGCGCAGCCCGCATCGTTTATGCTACAGACTGTC-
                 AGTGCAGCTCTCCGATCCAAA

The PCR reaction mixture consisted of the following: 1 µl pGEM7Z at 1 ng/µl; 10 µl
primer 1 at 10 pmol/µl; 10 µl primer 2 at 10 pmol/µl; 10 µl deoxyribonucleotide
triphosphates at 2.5 mM; 10 µl 10x PCR buffer (Perkin-Elmer); 0.5 µl Taq DNA
polymerase at 5 units/µl; and 58 µl distilled water to give a final volume of 100 µl. The
reaction mixture was subjected to 25 cycles of 93°C for 30 sec; 60°C for 15 sec; and
72°C for 60 sec, to give a 172 basepair product, which was successively digested with
Bbv I (100 µl PCR reaction mixture, 12 µl 10x #1 New England Biolabs buffer, 8 µl
Bbv I at 1 unit/µl, incubated at 37°C for 6 hours) and with Bst XI (to the Bbv I reaction
mixture was added: 5 µl 1 M NaCl, 67 µl distilled water, and 8 µl Bst XI at 10 units/µl,
and the resulting mixture was incubated at 55°C for 2 hours).

After passing the above reaction mixture through a Centricon 30 (Amicon, Inc.)
spin column following manufacturer's protocol, the Bbv I/Bst XI-restricted fragment
was ligated to the double stranded linker attached to the Ballotini beads in the following
mixture: 17 µl Bbv I/Bst XI-restricted fragment (10 µg), 10 µl beads (20 mg), 6 µl 10x
ligation buffer (New England Biolabs, referred to below as NEB), 5 µl T4 DNA ligase
at 2000 units/µl, and 22 µl distilled water, which mixture was incubated at 25°C for 4
hours, after which the beads were washed 3 times with TE (pH 8.0), leaving the
following target polynucleotide for sequencing:

    [BEAD] -- . . . TCTGTAGCT
              . . . AGACATCGAATTT-5'

The strands of the following probes (24 nucleotides in labeled strand and 18
nucleotides in non-labeled strand) were separately synthesized on an automated DNA
synthesizer (model 392 Applied Biosystems, Foster City) using standard methods:

    FAM
    5'-pGNNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'

    TAMRA
    5'-pTNNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'

    ROX
    5'-pCNNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'

    JOE
    5'-pANNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'



where p is a monophosphate, N indicates A, C, G, or T, Q is a branched linker carrying
a protected amino group for attachment of a label (e.g. Uni-Link AminoModifier,
available from Clontech Laboratories, Palo Alto, CA), and FAM, TAMRA, ROX, and
JOE are as defined above. 5.0 x 10^4 pmol of each probe was combined in TE to form
a mixture at a concentration of 1000 pmol/µl.

Ligations were carried out in a mixture consisting of 5 µl beads (20 mg), 3 µl
NEB 10x ligase buffer, 5 µl probe mix, 2.5 µl NEB T4 DNA ligase (2000 units/µl), and
14.5 µl distilled water. The mixture was incubated at 16°C for 30 minutes, after which
the beads were washed 3 times in TE (pH 8.0). Cleavages were carried out in a mixture
consisting of 5 µl beads (20 mg), 3 µl 10x NEB buffer #3, 3 µl NEB Fok I (4 units/µl),
and 19 µl distilled water. The mixture was incubated at 37°C for 30 minutes, after
which the beads were washed 3 times in TE (pH 8.0).
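
As a quick consistency check on the two recipes above, the component volumes can be summed and the final buffer strength computed. This is only a bookkeeping sketch; the dictionary keys and the function name are hypothetical labels for the components listed in the text.

```python
def final_strength(components_ul, stock_name, stock_fold):
    """Return (total volume in microliters, final fold-strength of one stock)."""
    total = sum(components_ul.values())
    return total, stock_fold * components_ul[stock_name] / total


LIGATION = {
    "beads": 5.0,
    "10x ligase buffer": 3.0,
    "probe mix": 5.0,
    "T4 DNA ligase": 2.5,
    "water": 14.5,
}

CLEAVAGE = {
    "beads": 5.0,
    "10x buffer #3": 3.0,
    "Fok I": 3.0,
    "water": 19.0,
}

if __name__ == "__main__":
    print(final_strength(LIGATION, "10x ligase buffer", 10))
    print(final_strength(CLEAVAGE, "10x buffer #3", 10))
```

Both mixtures come to 30 µl, so 3 µl of 10x stock gives a 1x working buffer in each case.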

After each ligation, a sample of the beads with the ligated complex was removed
for size analysis on a model 373 DNA sequencer using 672 GeneScan software (Applied
Biosystems). The readout of the system provides a different colored curve for
fragments labeled with the four different dyes (black for TAMRA, blue for FAM, green
for JOE, and red for ROX). A 6% denaturing (8 M urea) polyacrylamide gel was
employed in accordance with manufacturer's protocols. About 0.5 mg of beads were
placed in 4 µl of formamide loading buffer in accordance with the manufacturer's
protocol for analyzing sequencing fragments. Samples were heated to 95°C for 2 min
then cooled by placing on ice, after which the entire sample was loaded into one lane.

Results of four cycles of ligation are shown in Figures 4a through 4d. Curve a
of figure 4a demonstrates that the first nucleotide in the target sequence is correctly
identified as A. The first nucleotide is the one in the protruding strand closest to the
double stranded portion of the target polynucleotide. Curves s1 and s2 are 172 and 186
nucleotide size standards. The very low curves indicated by "b" in the figure show that
the fidelity of the ligase was very high, in that little or no other probes besides the
correct one were ligated. Curve c in figure 4b demonstrates that the second nucleotide
of the target polynucleotide is correctly identified as A. Note that as in figure 4a, only
an insignificant number of probes were incorrectly ligated, as indicated by "d". Figure 4c
is a superposition of curve c of figure 4b onto the curve of figure 4a. This shows that
curve c corresponds to a fragment one nucleotide shorter than that of curve a, as
expected after the Fok I digestion. Figure 4d is a superposition of the data on the
fragments generated in cycles 2, 3, and 4, indicated by curves e, f, and g, respectively.
Again, the fidelity of ligation is very high and the peaks of the curves are in the correct
order, as expected from the one nucleotide size reduction that takes place after each
Fok I digestion.



Example 7

Sequencing a Target Polynucleotide
Amplified from pGEM7Z: Identification of
Nucleotide by a Polymerase Extension Reaction

In this example, a segment of plasmid pGEM7Z was amplified by PCR using a
biotinylated primer and attached by the biotin to streptavidinated magnetic beads. After
each cleavage step, the resulting protruding strand of the target polynucleotide was used
as a template to extend the recessed strand by one nucleotide using a DNA polymerase
in the presence of a mixture of labeled dideoxynucleoside triphosphates. The extended
strand was then analyzed by gel electrophoresis as described above.

The PCR reaction was prepared by combining the following: 1 µl pGEM7Z
plasmid (1 pg/µl), 1 µl B002 biotinylated primer (100 pmoles/µl), 1 µl -337 primer (100
pmole/µl), 20 µl 10x nucleoside triphosphates (2.5 mM stock of each triphosphate), 20
µl 10x Taq buffer (Perkin-Elmer), 156 µl distilled water, and 1 µl Taq (2 units/µl). The
primers had the following sequences:

    B002: 5'-biotin-CCCGACGTCGCATGCTCCTCTA

    -337: 5'-GCGCGTTGGCCGATTCATTA

The above PCR mixture was cycled 25 times through the following temperatures in a
Perkin-Elmer 9600 thermal cycler: 94°C 1 min, 52°C 1 min, and 72°C 2 min. After
cycling, to the reaction mixture was added 10 µg glycogen and 100 µl chloroform, after
which the aqueous phase was removed and combined with 20 µl 3 M NaOAc and 500
µl ethanol. After the resulting mixture was spun in a microfuge for 30 min, the
precipitate was collected, dried, and resuspended in 50 µl H2O. Prior to combining
with the biotinylated DNA, the streptavidinated magnetic beads (20 µl) were washed 3
times with 100 µl of 2x bead wash (1.0 M NaCl, Tris, triton X-100) and then
resuspended in 10 µl of 2x bead wash. 10 µl of the biotinylated DNA solution was
added to the beads and allowed to sit for 5 min with agitation, after which the beads
were magnetically pulled to the side of the tube, the supernatant removed, and the beads
washed twice with 2x bead wash and 3 times with water.
An initial protruding strand was produced at the end of the attached target
polynucleotide by cleaving with Fok I as follows: To the beads were added: 44 µl
H2O, 5 µl 10x Fok I buffer (New England Biolabs), and 1 µl Fok I (New England
Biolabs, 4 units/µl). The mixture was incubated for 30 min, after which the supernatant
was removed from the magnetic beads. After this initial cleavage, three cycles of
extension, ligation, excision, and cleavage were carried out with the following
protocols. After each extension a sample of beads was removed from the reaction
mixture and the labeled strand of the target polynucleotide was analyzed as described in
Example 6.

Extension reactions were carried out with Sequenase DNA polymerase in the
presence of labeled dideoxynucleosides by adding to the beads the following mixture:
17.0 µl H2O, 5.0 µl 5x Sequenase buffer, 2.5 µl 10x Taq fluorescent dye-labeled
terminators (Perkin-Elmer), and 1.0 µl Sequenase 2.0 (13 units/µl). After incubation at
37°C for 15 min, proteinaceous material was extracted with 50 µl phenol/chloroform,
which was then back extracted with 25 µl H2O. The combined aqueous phases were
again extracted with 50 µl chloroform, after which the aqueous phase was removed and
mixed with 5 µl 3 M NaOAc and 125 µl ethanol. The precipitate was collected,
microfuged for 15 min, washed with 70% ethanol, and dried.

A mixed probe was prepared as described in Example 6 with the following
differences: (i) the probe is unlabeled, thus, only a single mixture need be prepared; and
(ii) the protruding strand consisted of three nucleotides such that each of the three
positions in the protruding strand could be A, C, G, or T, i.e. each was "N" as described
above. Ligation was carried out as follows: To a 0.5 ml tube containing the dried DNA
was added 20.5 µl probe (100 pmoles/µl), 2.5 µl 10x ligase buffer (New England
Biolabs), 2.0 µl ligase (New England Biolabs, 0.4 units/µl). The mixture was incubated
for 1 hour at 16°C, after which the DNA was purified on a spin column prepared as
follows: resin was swelled with 800 µl H2O for 45 min, drained, and spun at 800 rpm
for 2 min.

The labeled terminator was excised from the ligated complex with the 3'→5'
exonuclease activity of Deep Vent DNA polymerase. At the same time, the polymerase
extends the strand the length of the probe, thereby repairing the nick caused by the
presence of the dideoxy terminator. The reaction was carried out in a MicroAmp tube
(Perkin-Elmer) containing the following: 25.0 µl DNA, 3.5 µl 10x nucleoside
triphosphates (1.25 mM each), 3.5 µl 10x Vent buffer (New England Biolabs), and 2.0
µl Deep Vent DNA polymerase (2 units/µl). The mixture was incubated for 60 min at
80°C under oil, after which 15 µl H2O was added and the combined mixture was
extracted with 100 µl chloroform. The aqueous phase was removed and mixed with 5
µl 3 M NaOAc and 125 µl ethanol, after which the precipitate was collected, microfuged
for 15 min, washed with 70% ethanol, and dried.

Fok I cleavage was carried out by resuspending the DNA in 21.5 µl H2O and
adding 2.5 µl 10x Fok I buffer (New England Biolabs) and 1.0 µl Fok I (4 units/µl).
The mixture was incubated for 15 min at 37°C, after which the DNA was purified on a
spin column prepared as described above.

Results are shown in Figures 5a through 5c. The colors of the curves generated 
by the GeneScan software containing the dominant peaks in the figures corresponded to 
the correct nucleotide in the target polynucleotide. 



SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT: Sydney Brenner 



(ii) TITLE OF INVENTION: DNA Sequencing by Stepwise 
Ligation and Cleavage 



(iii) NUMBER OF SEQUENCES: 16 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Stephen C. Macevicz, Lynx 
Therapeutics, Inc. 

(B) STREET: 3832 Bay Center Place 

(C) CITY: Hayward 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 94545 



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: 3.5 inch diskette 

(B) COMPUTER: IBM compatible 

(C) OPERATING SYSTEM: Windows 3.1 /DOS 5.0 

(D) SOFTWARE: Microsoft Word for Windows, vers. 2.0 



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/222,300 

(B) FILING DATE: 04-APR-94 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/280,441 

(B) FILING DATE: 25-JUL-94 



(viii) ATTORNEY / AGENT INFORMATION: 

(A) NAME: Stephen C. Macevicz 

(B) REGISTRATION NUMBER: 30,285 

(C) REFERENCE / DOCKET NUMBER: slc3 



(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 638-5552 

(B) TELEFAX: (510) 670-9302



(2) INFORMATION FOR SEQ ID NO: 1: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AGTGAATTCG AGCTCGGT 18



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



CCTTTGAGTG AGCTGATA 18 



(2) INFORMATION FOR SEQ ID NO: 3: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



AATTCGAGCT CGGTACCCGG GGATCCTCTA 30 



(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



ATCGGATGAC ATCAAC 16 



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ACTCCTTCNN NNTACCGG 18 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
AGACGCCATC ATNNNNTACC GG 22

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
CAGATCCTCT TCA 13 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CAGATCCTCT TC 12 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



CTAAACCATT GGTATGGGCC AGTGAATTGT AATA 



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



CGCGCAGCCC GCATCGTTTA TGCTACAGAC TGTCAGTGCA 
GCTCTCCGAT CCAAA 



(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



GNNNTACGTG CGCATCCCGA GC 



(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



TNNNTACGTG CGCATCCCGA GC 

(2) INFORMATION FOR SEQ ID NO: 13:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



CNNNTACGTG CGCATCCCGA GC 



(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



ANNNTACGTG CGCATCCCGA GC 



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



CCCGACGTCG CATGCTCCTC TA 22 



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



GCGCGTTGGC CGATTCATTA

I claim:

1. A method for determining a nucleotide sequence of a polynucleotide, the method
comprising the steps of:

(a) ligating a probe to an end of a polynucleotide, the probe having a nuclease
recognition site;

(b) identifying one or more nucleotides at the end of the polynucleotide; and

(c) cleaving the polynucleotide with a nuclease recognizing the nuclease
recognition site of the probe such that the polynucleotide is shortened by one or more
nucleotides.

2. The method of claim 1 further including the step of repeating said steps (a) 
through (c) until said nucleotide sequence of said polynucleotide is determined. 
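
For illustration only, and not as part of the claimed subject matter, the repeated cycle of claims 1 and 2 can be sketched in Python as a toy model. The function name `stepwise_reads`, the representation of the polynucleotide as a string read from its free end, and the parameter `step` (the number of nucleotides identified and removed per cycle) are assumptions introduced here; the enzymatic ligation itself is not modeled.

```python
def stepwise_reads(polynucleotide, step=1):
    """Toy model of repeated ligate / identify / cleave cycles.

    Returns the nucleotides identified in each cycle, in order.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    remaining = polynucleotide.upper()
    reads = []
    while remaining:
        # (a) Ligation of the probe is not modeled; in the method it supplies
        #     the recognition site that positions the nuclease.
        # (b) Identify one or more nucleotides at the end of the polynucleotide.
        reads.append(remaining[:step])
        # (c) Cleavage shortens the polynucleotide by the same number of nucleotides.
        remaining = remaining[step:]
    return reads


if __name__ == "__main__":
    cycles = stepwise_reads("GCGCGTTGGC", step=1)
    print(len(cycles), "cycles:", "".join(cycles))
```

Each loop iteration corresponds to one pass through steps (a) through (c); the loop ends when no nucleotides remain, which mirrors the repetition recited in claim 2.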

3. The method of claim 2 wherein said nuclease is a type IIs restriction
endonuclease.

4. The method of claim 3 further including a step of blocking recognition sites of
said nuclease on said polynucleotide.

5. The method of claim 4 wherein said step of ligating includes treating said 
polynucleotide and said probe with a ligase. 

6. The method of claim 5 wherein said polynucleotide has a protruding strand at at
least one end and wherein said probe has a protruding strand at one end, the protruding 
strand of said probe being complementary to the protruding strand at one end of said 
polynucleotide. 

7. The method of claim 6 wherein said protruding strand of said polynucleotide has
a 5'-phosphoryl group and wherein said complementary protruding strand of said probe 
lacks a 5'-phosphoryl group. 

8. The method of claim 7 wherein said step of ligating includes treating said
polynucleotide and said probe in succession with (i) a ligase to ligate said protruding
strand having said 5'-phosphoryl group to said probe, (ii) a kinase to phosphorylate said
complementary protruding strand of said probe, and a ligase to ligate said
complementary protruding strand of said probe to said polynucleotide.

9. The method of claim 6 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe
includes every possible sequence of nucleotides the length of said protruding strand.
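
As a hedged illustration of what "every possible sequence of nucleotides the length of said protruding strand" implies, the following Python sketch enumerates such a mixture. The function names `probe_overhangs` and `matching_overhang` are hypothetical, and the sketch assumes a DNA alphabet of A, C, G, and T with standard Watson-Crick pairing.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def probe_overhangs(length):
    """Every possible overhang sequence of the given length (4**length of them)."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


def matching_overhang(target_overhang):
    """Reverse complement of the polynucleotide's protruding strand."""
    return "".join(COMPLEMENT[base] for base in reversed(target_overhang.upper()))


if __name__ == "__main__":
    mixture = probe_overhangs(4)
    print(len(mixture))                         # size of the mixture for a 4-nucleotide overhang
    print(matching_overhang("AACC") in mixture)  # the complementary overhang is always present
```

For a protruding strand of four nucleotides the mixture holds 4^4 = 256 distinct overhangs, so whatever the sequence of the polynucleotide's protruding strand, one member of the mixture is complementary to it.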


10. The method of claim 6 further including the step of removing unligated probe 
from said polynucleotide after said step of ligating. 

11. The method of claim 6 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by the identity of said probe
ligated thereto.

12. The method of claim 11 further including the step of capping said polynucleotide
which fails to ligate to said probe. 


13. The method of claim 12 wherein said step of capping includes extending said 
polynucleotide with a DNA polymerase in the presence of chain-terminating nucleoside 
triphosphates. 

14. The method of claim 13 wherein said chain-terminating nucleoside triphosphates
are dideoxynucleoside triphosphates. 

15. The method of claim 6 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by extending a strand of said
polynucleotide or said probe with a nucleic acid polymerase in the presence of
chain-terminating nucleoside triphosphates.






16. The method of claim 15 wherein said step of identifying further includes 
extending a strand of said polynucleotide. 

17. The method of claim 16 wherein said chain-terminating nucleoside triphosphates 
are labeled. 



18. The method of claim 15 wherein said step of identifying further includes
extending a strand of said probe and wherein said chain-terminating nucleoside
triphosphates are labeled.

19. The method of claim 1 wherein said polynucleotide has a protruding strand at at
least one end and wherein said probe has a protruding strand at one end, the protruding 
strand of said probe being complementary to the protruding strand at one end of said 
polynucleotide. 


20. The method of claim 19 further including the step of repeating said steps (a) 
through (c) until said nucleotide sequence of said polynucleotide is determined. 

21. The method of claim 20 wherein said step of ligating includes treating said
polynucleotide and said probe with a ligase.

22. The method of claim 21 wherein said nuclease is a type IIs restriction
endonuclease.



23. The method of claim 22 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe 
includes every possible sequence of nucleotides the length of said protruding strand. 

24. The method of claim 23 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

25. The method of claim 1 wherein said step of identifying includes identifying a 
nucleotide in said protruding strand of said polynucleotide by extending a strand of said 
polynucleotide or said probe with a nucleic acid polymerase. 


26. The method of claim 25 further including the step of repeating said steps (a) 
through (c) until said nucleotide sequence of said polynucleotide is determined. 

27. The method of claim 26 wherein said step of identifying further includes
extending a strand of said polynucleotide in the presence of chain-terminating
nucleoside triphosphates.

28. The method of claim 27 wherein said chain-terminating nucleoside triphosphates 
are labeled. 


29. The method of claim 28 wherein said chain-terminating nucleoside triphosphates
are labeled with fluorescent dyes.

30. The method of claim 29 wherein said fluorescent dyes have spectrally resolvable
fluorescence emission bands.

31. The method of claim 1 wherein said polynucleotide has a protruding strand at
one end and is attached to a solid phase support by another end and wherein said probe
has a protruding strand at one end, the protruding strand of said probe being
complementary to the protruding strand at one end of said polynucleotide.

32. The method of claim 31 further including the step of repeating said steps (a)
through (c) until said nucleotide sequence of said polynucleotide is determined.

33. The method of claim 32 wherein said step of ligating includes treating said 
polynucleotide and said probe with a ligase. 

34. The method of claim 33 wherein said nuclease is a type IIs restriction
endonuclease.

35. The method of claim 34 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe
includes every possible sequence of nucleotides the length of said protruding strand.

36. The method of claim 35 further including the step of removing unligated probe 
from said polynucleotide after said step of ligating. 

37. The method of claim 36 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by extending a strand of said 
polynucleotide with a nucleic acid polymerase in the presence of chain-terminating 
nucleoside triphosphates. 

38. A method for determining a nucleotide sequence of a polynucleotide, the method
comprising the steps of:

(a) ligating a probe to an end of a polynucleotide having a protruding strand to
form a ligated complex, the probe having an end with a complementary protruding
strand to that of the polynucleotide and the probe having a nuclease recognition site;

(b) cleaving the ligated complex with a nuclease, the nuclease recognizing the
recognition site and cleaving the ligated complex such that an augmented probe is
released leaving a protruding strand on the polynucleotide;

(c) identifying one or more nucleotides in the protruding strand of the
polynucleotide; and

(d) repeating steps (a) through (c) until the nucleotide sequence of the
polynucleotide is determined.


39. The method of claim 38 wherein said nuclease is a type IIs restriction
endonuclease and wherein said polynucleotide is provided with recognition sites of said 
nuclease blocked. 

40. The method of claim 39 wherein said recognition sites of said polynucleotide are
blocked with a methylase.

41. The method of claim 39 wherein said step of ligating includes treating said
polynucleotide with a ligase. 


42. The method of claim 41 wherein said protruding strand of said polynucleotide
has a 5'-phosphoryl group and wherein said complementary protruding strand of said
probe lacks a 5'-phosphoryl group.

43. The method of claim 42 wherein said step of ligating includes treating said
polynucleotide and said probe in succession with (i) a ligase to ligate said protruding
strand having said 5'-phosphoryl group to said probe, (ii) a kinase to phosphorylate said
complementary protruding strand of said probe, and a ligase to ligate said
complementary protruding strand of said probe to said polynucleotide.


44. The method of claim 43 wherein said step of identifying includes identifying a 
nucleotide in said protruding strand of said polynucleotide by the identity of said probe 
ligated thereto. 

45. The method of claim 41 wherein said polynucleotide is attached to a solid phase
support. 

46. The method of claim 45 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe
includes every possible sequence of nucleotides the length of said protruding strand.



47. The method of claim 46 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

48. The method of claim 47 wherein said step of identifying includes identifying a 
nucleotide in said protruding strand of said polynucleotide by the identity of said probe 
ligated thereto. 

49. The method of claim 47 wherein said step of identifying includes identifying a 
nucleotide in said protruding strand of said polynucleotide by extending a strand of said 
polynucleotide or said probe with a nucleic acid polymerase in the presence of
chain-terminating nucleoside triphosphates.

50. The method of claim 47 wherein said solid phase support is a microparticle.

51. The method of claim 50 wherein said type IIs restriction endonuclease is Fok I.

52. A method for determining a nucleotide sequence of a polynucleotide, the method 
comprising the steps of: 

(a) providing a polynucleotide in double stranded form such that the 
polynucleotide is attached to a solid phase support and has a protruding strand at one 
end; 

(b) ligating a probe to the protruding strand of the polynucleotide to form a
ligated complex, the probe having an end with a complementary protruding strand to
that of the polynucleotide and the probe having a type IIs endonuclease recognition site;

(c) identifying a nucleotide in the protruding strand of the polynucleotide by the
identity of the ligated probe;

(d) cleaving the ligated complex with a type IIs endonuclease that recognizes the
type IIs endonuclease recognition site and cleaves the ligated complex such that an
augmented probe is released leaving a new protruding strand on the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the 
polynucleotide is determined. 

53. The method of claim 52 wherein said probe comprises a first single stranded
oligonucleotide and a second single stranded oligonucleotide, the first single stranded
oligonucleotide having an end with complementary nucleotides to those in said
protruding strand of said polynucleotide and the second single stranded oligonucleotide
being complementary to a portion of the first single stranded oligonucleotide such that
the first and second single stranded oligonucleotides are capable of forming a duplex
containing a type IIs endonuclease recognition site, and wherein said step of ligating
includes (i) annealing the first single stranded oligonucleotide to said protruding strand
of said polynucleotide under conditions that promote the formation of a perfectly
matched duplex therebetween, (ii) ligating the first single stranded oligonucleotide to
said polynucleotide, (iii) annealing the second single stranded oligonucleotide to the first
single stranded oligonucleotide, and (iv) ligating the second single stranded
oligonucleotide to said polynucleotide.

54. The method of claim 53 wherein said step of ligating includes providing said first
single stranded oligonucleotide as a mixture such that said complementary nucleotides in
said end of said first single stranded oligonucleotide include every possible sequence of
nucleotides the length of said end.

55. The method of claim 54 further including the step of removing unligated said 
first and second single stranded oligonucleotides from said polynucleotide after said step 
of ligating. 


56. The method of claim 55 wherein said probe comprises four components, each 
component being capable of indicating the presence of a different nucleotide in said 
protruding strand of said polynucleotide upon ligation. 

57. The method of claim 56 wherein each of said components of said probe is
labeled with a different fluorescent dye and the different fluorescent dyes are spectrally
resolvable.
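
As a minimal sketch of how a four-component, four-dye probe could be read out, the following Python fragment maps an observed dye back to the nucleotide in the protruding strand. It is illustrative only: the dye names `dye_1` through `dye_4`, the table `DYE_OF_PROBE_BASE`, and the function `call_nucleotide` are hypothetical placeholders, and the sketch assumes that each component carries one discriminating nucleotide that pairs with the nucleotide being identified.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical labels: which dye marks the probe component carrying each
# discriminating nucleotide. The dye names are placeholders, not from the source.
DYE_OF_PROBE_BASE = {"A": "dye_1", "C": "dye_2", "G": "dye_3", "T": "dye_4"}
PROBE_BASE_OF_DYE = {dye: base for base, dye in DYE_OF_PROBE_BASE.items()}


def call_nucleotide(observed_dye):
    """Nucleotide in the protruding strand implied by the dye of the ligated probe."""
    probe_base = PROBE_BASE_OF_DYE[observed_dye]
    # A perfectly matched duplex pairs the probe base with its complement.
    return COMPLEMENT[probe_base]


if __name__ == "__main__":
    for dye in sorted(PROBE_BASE_OF_DYE):
        print(dye, "->", call_nucleotide(dye))
```

Because the four dyes are spectrally resolvable, each observed dye corresponds to exactly one component, and therefore to exactly one nucleotide in the protruding strand.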

58. A method for determining a nucleotide sequence of a polynucleotide, the method
comprising the steps of:

(a) providing a polynucleotide in double stranded form such that the
polynucleotide is attached to a solid phase support and has a protruding strand and a
recessed strand at one end;

(b) identifying a nucleotide in the protruding strand of the polynucleotide by
extending the recessed strand with a nucleic acid polymerase;

(c) ligating a probe to the one end of the polynucleotide, the probe having a type
IIs restriction endonuclease recognition site;

(d) cleaving the polynucleotide with a type IIs restriction endonuclease that
recognizes the type IIs endonuclease recognition site leaving a new protruding strand on
the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the
polynucleotide is determined.

59. The method of claim 58 wherein said nucleic acid polymerase extends said
recessed strand in the presence of chain-terminating nucleoside triphosphates.

60. The method of claim 59 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

61. The method of claim 60 further including a step of blocking recognition sites of
said type IIs restriction endonuclease on said polynucleotide.

62. The method of claim 61 wherein said recognition sites of said polynucleotide are
blocked with a methylase.

63. The method of claim 61 wherein said probe has a 5' protruding strand one
nucleotide less in length than said protruding strand of said polynucleotide and wherein
said step of ligating includes providing said probe as a mixture such that the protruding
strand of the probe includes every possible sequence of nucleotides the length of the
protruding strand.

64. The method of claim 63 wherein said chain-terminating nucleoside triphosphate
is a labeled dideoxynucleoside triphosphate and wherein said step of identifying includes
identifying said one or more nucleotides by the label on the labeled dideoxynucleoside
triphosphate incorporated into said recessed strand of said polynucleotide.

65. The method of claim 64 further including the steps of excising said labeled
dideoxynucleotide and extending said recessed strand with a nucleic acid polymerase.

66. The method of claim 65 wherein said step of excising is carried out with T4 
DNA polymerase in the presence of deoxyribonucleoside triphosphates. 

67. A double stranded nucleic acid probe comprising:

a first oligonucleotide strand;

a second oligonucleotide strand such that the first and second oligonucleotide
strands form a perfectly matched duplex in a duplex forming region and such that the
second oligonucleotide strand forms a protruding strand with respect to the duplex
forming region, the protruding strand including every possible sequence of nucleotides
the length of the protruding strand; and

a nuclease recognition site within the duplex forming region.

68. The double stranded nucleic acid probe of claim 67 wherein said nuclease
recognition site is the recognition site of a type IIs restriction endonuclease having a
recognition site and a cleavage site and wherein the cleavage site is located outside of
said duplex forming region.


69. A method of determining the zygosity of an individual at a predetermined 
genetic locus having a plurality of allelic forms of DNA, the method comprising the 
steps of: 

(a) providing a sample of the DNA from the predetermined genetic locus such
that the sample of DNA comprises polynucleotides, each polynucleotide of the sample
having a protruding strand and a recessed strand;

(b) ligating a probe to an end of each polynucleotide to form one or more ligated
complexes, the probe having a nuclease recognition site;

(c) identifying the kind and relative abundance of nucleotides in the protruding
strand of the polynucleotide;

(d) cleaving the ligated complexes with a nuclease; and 

(e) repeating steps (b) through (d) until the nucleotide sequences of the 
polynucleotides of the genetic locus are determined. 
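
One way to picture how "the kind and relative abundance of nucleotides" could bear on zygosity is the following Python sketch. It is an assumption-laden illustration rather than the claimed procedure: the function names, the per-cycle signal dictionaries, the example intensities, and the `min_fraction` threshold of 0.25 are all hypothetical choices made here.

```python
def nucleotides_present(signals, min_fraction=0.25):
    """Nucleotides whose share of the total signal in one cycle reaches min_fraction.

    `signals` maps each nucleotide to a measured label intensity (hypothetical units).
    """
    total = sum(signals.values())
    if total <= 0:
        raise ValueError("no signal recorded for this cycle")
    return sorted(base for base, s in signals.items() if s / total >= min_fraction)


def call_zygosity(per_cycle_signals, min_fraction=0.25):
    """Report 'heterozygous' if any cycle shows more than one abundant nucleotide."""
    for signals in per_cycle_signals:
        if len(nucleotides_present(signals, min_fraction)) > 1:
            return "heterozygous"
    return "homozygous"


if __name__ == "__main__":
    # Made-up intensities for two sequencing cycles at one locus.
    cycles = [
        {"A": 100, "C": 2, "G": 1, "T": 0},   # one nucleotide dominates
        {"A": 48, "C": 0, "G": 52, "T": 0},   # two nucleotides in similar abundance
    ]
    print(call_zygosity(cycles))
```

In this sketch a position where two nucleotides appear at comparable abundance is taken to indicate two allelic forms in the sample; a real assay would need a threshold chosen from its own signal and noise characteristics.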

70. The method of claim 69 wherein each of said polynucleotides is attached to a
separate solid phase support or a separate region of the same solid phase support. 

71. The method of claim 70 wherein said nuclease is a type IIs restriction
endonuclease and wherein said step of identifying includes identifying a nucleotide in
each of said protruding strands of said polynucleotides by extending a strand of each of
said polynucleotides with a nucleic acid polymerase in the presence of chain-terminating
nucleoside triphosphates.

72. The method of claim 71 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

73. The method of claim 72 wherein said chain-terminating nucleoside triphosphates
are labeled dideoxynucleoside triphosphates and wherein said step of identifying
includes identifying said nucleotide by the label on the labeled dideoxynucleoside
triphosphates incorporated into said recessed strand of said polynucleotide.

74. The method of claim 73 further including the steps of excising said labeled
dideoxynucleotides and extending said recessed strands with a nucleic acid polymerase.

75. The method of claim 74 wherein said step of excising is carried out with T4
DNA polymerase in the presence of deoxyribonucleoside triphosphates.

76. A kit for determining the nucleotide sequence of a polynucleotide, the kit
comprising:

a probe capable of being ligated to an end of a polynucleotide to form a ligated
complex, the probe having a type IIs restriction endonuclease recognition site; and

a type IIs restriction endonuclease capable of recognizing the type IIs restriction
endonuclease recognition site of the probe.

77. The kit of claim 76 further comprising a ligase for ligating said probe to said end
of said polynucleotide.

78. The kit of claim 77 further comprising a first reaction buffer for said ligase and a
second reaction buffer for said type IIs restriction endonuclease.

79. The kit of claim 78 further comprising:

a nucleic acid polymerase;

labeled chain-terminating nucleoside triphosphates; and

a third reaction buffer for the nucleic acid polymerase.